Conference Proceeding Article
Issue tracking systems are valuable resources during software maintenance activities and contain information about the issues faced during the development of a project as well as after its release. Many projects receive a large number of bug reports, and it is challenging for developers to manually debug and fix them. To mitigate this problem, past studies have proposed information retrieval (IR)-based bug localization techniques, which take as input a textual description of a bug stored in an issue tracking system and return a list of potentially buggy source code files. These studies often evaluate their effectiveness on issue reports marked as bugs in issue tracking systems, using as ground truth the set of files that are modified in commits that fix each bug. However, there are a number of potential biases that can impact the validity of the results reported in these studies. First, issue reports marked as bugs might not be reports of bugs, due to errors in the reporting and classification process. Many issue reports are about documentation updates, requests for improvement, refactoring, code cleanups, etc. Second, bug reports might already explicitly specify the buggy program files, and for these reports bug localization techniques are not needed. Third, files that get modified in commits that fix the bugs might not contain the bug. This study investigates the extent to which these potential biases affect the results of a bug localization technique and whether bug localization researchers need to consider these potential biases when evaluating their solutions. In this paper, we analyse issue reports from three different projects, HTTPClient, Jackrabbit, and Lucene-Java, to examine the impact of the above three biases on bug localization. Our results show that one of these biases significantly and substantially impacts bug localization results, while the other two biases have negligible or minor impact.
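To illustrate the general shape of an IR-based bug localizer as described in the abstract (bug report text in, ranked list of source files out), here is a minimal sketch. It assumes a plain TF-IDF cosine-similarity ranking, which is a common baseline and not necessarily the technique evaluated in the paper. The function names `tokenize`, `rank_files`, and `mentions_file`, and the file paths in the usage lines, are hypothetical. `mentions_file` is a rough heuristic for the second bias: it flags reports that already name the file.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def rank_files(report, files):
    """Rank files by TF-IDF cosine similarity to the bug report text.

    files: dict mapping a (hypothetical) file path to its content.
    Returns the paths, most similar first; ties are broken by path.
    """
    docs = {path: Counter(tokenize(body)) for path, body in files.items()}
    n = len(docs)

    # Document frequency: number of files in which each term occurs.
    df = Counter()
    for counts in docs.values():
        df.update(counts.keys())

    def weights(counts):
        # Smoothed IDF, so terms that occur in every file keep a small weight.
        return {
            term: tf * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, tf in counts.items()
        }

    def cosine(a, b):
        dot = sum(w * b.get(term, 0.0) for term, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query = weights(Counter(tokenize(report)))
    scores = {path: cosine(query, weights(c)) for path, c in docs.items()}
    return sorted(scores, key=lambda p: (-scores[p], p))


def mentions_file(report, path):
    """Return True if the report contains the file's base name as a token."""
    stem = path.rsplit("/", 1)[-1].rsplit(".", 1)[0].lower()
    return stem in tokenize(report)


# Usage with made-up data (assumption: two tiny "source files").
example_files = {
    "src/ConnectionPool.java": "class ConnectionPool { void release connection socket timeout }",
    "src/QueryParser.java": "class QueryParser { parse query token syntax }",
}
example_report = "Socket timeout when releasing a connection back to the pool"
ranking = rank_files(example_report, example_files)
already_localized = mentions_file(example_report, ranking[0])
```

In an evaluation along the lines the abstract describes, the ranking would be compared against the files modified in the fixing commit, and reports for which `mentions_file` is true could be analysed separately to see how much they influence the measured effectiveness.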
Issue Reports, Bug Localization, Bias, Empirical Study
Information Security | Software Engineering
Software and Cyber-Physical Systems
ASE '14: Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering: September 15-19, 2014, Västerås, Sweden
Kochhar, Pavneet Singh; Tian, Yuan; and LO, David.
Potential Biases in Bug Localization: Do they Matter? (2014). ASE '14: Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering: September 15-19, 2014, Västerås, Sweden. 803-814. Research Collection School Of Information Systems.
Available at: http://ink.library.smu.edu.sg/sis_research/2425
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.