Publication Type

Journal Article

Version

publishedVersion

Publication Date

12-2023

Abstract

Code search is a common yet important activity of software developers. An efficient code search model can greatly facilitate the development process and improve programming quality. Given their strong ability to learn contextual representations, deep learning models, especially pre-trained language models, have been widely explored for the code search task. However, existing studies mainly focus on proposing new architectures for ever-better performance on designed test sets, and ignore the performance on unseen test data where only natural language queries are available. The same problem in other domains, e.g., CV and NLP, is usually solved by test input selection, which uses a subset of the unseen set to reduce the labeling effort. However, approaches from other domains are not directly applicable and still require labeling effort. In this article, we propose kNN-based performance testing (KAPE) to efficiently solve the problem without manually matching code snippets to test queries. The main idea is to use semantically similar training data to perform the evaluation. Extensive experiments on six programming language datasets, three state-of-the-art pre-trained models, and seven baseline methods demonstrate that KAPE can effectively assess model performance (e.g., CodeBERT achieves an MRR of 0.5795 on JavaScript) with only a slight difference (e.g., 0.0261).
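The main idea stated in the abstract (evaluating on semantically similar training data when test queries have no matched code) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of cosine similarity over precomputed embeddings, the choice of k, and the candidate pool used for ranking are all assumptions made for illustration only.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def knn_training_indices(test_q, train_q, k):
    """Hypothetical helper: for each unlabeled test query embedding, pick the
    k most similar training query embeddings and return the sorted union of
    their indices."""
    sims = l2_normalize(test_q) @ l2_normalize(train_q).T
    topk = np.argsort(-sims, axis=1)[:, :k]
    return np.unique(topk)


def mean_reciprocal_rank(query_emb, code_emb):
    """MRR where row i of code_emb is the correct snippet for query i and
    every row of code_emb is a ranking candidate."""
    sims = l2_normalize(query_emb) @ l2_normalize(code_emb).T
    correct = np.diag(sims)[:, None]
    # Rank of the correct snippet = 1 + number of candidates scored strictly higher.
    ranks = 1 + (sims > correct).sum(axis=1)
    return float((1.0 / ranks).mean())


def estimate_mrr(test_q, train_q, train_code, k=1):
    """Assumed procedure: score the model on the labeled training pairs that
    are nearest to the unlabeled test queries, and use that as the estimate."""
    idx = knn_training_indices(test_q, train_q, k)
    return mean_reciprocal_rank(train_q[idx], train_code[idx])
```

Here `test_q`, `train_q`, and `train_code` are assumed to be 2-D arrays of embeddings produced by the code search model under test. The estimate depends on how closely the selected training queries resemble the unseen queries, which is the property the abstract refers to as semantic similarity.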

Keywords

Deep code search, software testing, deep learning testing, test selection

Discipline

Programming Languages and Compilers | Software Engineering

Research Areas

Information Systems and Management

Areas of Excellence

Digital transformation

Publication

ACM Transactions on Software Engineering and Methodology

Volume

33

Issue

2

First Page

1

Last Page

24

ISSN

1049-331X

Identifier

10.1145/3624735

Publisher

Association for Computing Machinery (ACM)

Citation

GUO, Yuejun; HU, Qiang; XIE, Xiaofei; CORDY, Maxime; PAPADAKIS, Mike; and LE TRAON, Yves. KAPE: kNN-based performance testing for deep code search. (2023). ACM Transactions on Software Engineering and Methodology. 33, (2), 1-24.
Available at: https://ink.library.smu.edu.sg/sis_research/9093

Copyright Owner and License

Authors

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1145/3624735

Included in

Programming Languages and Compilers Commons, Software Engineering Commons

Research Collection School Of Computing and Information Systems

KAPE: kNN-based performance testing for deep code search
