Publication Type

Journal Article

Version

publishedVersion

Publication Date

11-2025

Abstract

Bug localization is a crucial aspect of software maintenance, running through the entire software lifecycle. Information retrieval-based bug localization (IRBL) identifies buggy code based on bug reports, expediting the bug resolution process for developers. Recent years have witnessed significant achievements in IRBL, propelled by the widespread adoption of deep learning (DL). To provide a comprehensive overview of the current state of the art and delve into key issues, we conduct a survey encompassing 61 IRBL studies leveraging DL. We summarize best practices in each phase of the IRBL workflow, undertake a meta-analysis of prior studies, and suggest future research directions. This exploration aims to guide further advancements in the field, fostering a deeper understanding and refining practices for effective bug localization. Our study suggests that the integration of DL in IRBL enhances the model’s capacity to extract semantic and syntactic information from both bug reports and source code, addressing issues such as lexical gaps, neglect of code structure information, and cold-start problems. Future research avenues for IRBL encompass exploring diversity in programming languages, adopting fine-grained granularity, and focusing on real-world applications. Most importantly, although some studies have started using large language models for IRBL, there is still a need for more in-depth exploration and thorough investigation in this area.

Keywords

Bug localization, Deep learning, Information retrieval, Survey

Discipline

Artificial Intelligence and Robotics | Databases and Information Systems | Software Engineering

Research Areas

Data Science and Engineering; Cybersecurity; Software and Cyber-Physical Systems

Publication

ACM Computing Surveys

Volume

57

Issue

11

ISSN

0360-0300

Identifier

10.1145/3734217

Publisher

Association for Computing Machinery (ACM)
