Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
5-2021
Abstract
As a popular Q&A site for programming, Stack Overflow is a treasure for developers. However, the number of questions and answers on Stack Overflow makes it difficult for developers to efficiently locate the information they are looking for. There are two gaps leading to poor search results: the gap between the user's intention and the textual query, and the semantic gap between the query and the post content. Therefore, developers have to constantly reformulate their queries by correcting misspelled words, restricting the query to certain programming languages or platforms, etc. As query reformulation is tedious for developers, especially for novices, we propose an automated software-specific query reformulation approach based on deep learning. With query logs provided by Stack Overflow, we construct a large-scale query reformulation corpus, including the original queries and corresponding reformulated ones. Our approach trains a Transformer model that can automatically generate candidate reformulated queries when given the user's original query. The evaluation results show that our approach outperforms five state-of-the-art baselines, and achieves a 5.6% to 33.5% boost in terms of ExactMatch and a 4.8% to 14.4% boost in terms of GLEU.
Keywords
Data Mining, Deep Learning, Query Logs, Query Reformulation, Stack Overflow
Discipline
Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
Proceedings of the 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), Madrid, Spain, May 22-30
First Page
1273
Last Page
1285
ISBN
9780738113197
Identifier
10.1109/ICSE43902.2021.00116
Publisher
IEEE Computer Society
City or Country
Los Alamitos, CA
Citation
CAO, Kaibo; CHEN, Chunyang; BALTES, Sebastian; TREUDE, Christoph; and CHEN, Xiang.
Automated query reformulation for efficient search based on query logs from Stack Overflow. (2021). Proceedings of the 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), Madrid, Spain, May 22-30. 1273-1285.
Available at: https://ink.library.smu.edu.sg/sis_research/8848
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/ICSE43902.2021.00116