Publication Type

Conference Proceeding Article

Publication Date



To track bugs that appear in a software, developers often make use of a bug tracking system. Users can report bugs that they encounter in such a system. Bug reporting is inherently an uncoordinated distributed process though and thus when a user submits a new bug report, there might be cases when another bug report describing exactly the same problem is already present in the system. Such bug reports are duplicate of each other and these duplicate bug reports need to be identified. A number of past studies have proposed a number of automated approaches to detect duplicate bug reports. However, these approaches are not integrated to existing bug tracking systems. In this paper, we propose a tool named DupFinder, which implements the state-of-the-art unsupervised duplicate bug report approach by Runeson et al., as a Bugzilla extension. DupFinder does not require any training data and thus can easily be deployed to any project. DupFinder extracts texts from summary and description fields of a new bug report and recent bug reports present in a bug tracking system, uses vector space model to measure similarity of bug reports, and provides developers with a list of potential duplicate bug reports based on the similarity of these reports with the new bug report. We have released DupFinder as an open source tool in GitHub, which is available at:


Bugzilla, Duplicate bug reports, Integrated tool support


Information Security | Software Engineering

Research Areas

Software and Cyber-Physical Systems


ASE '14: Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering: September 15-19, 2014, Västerås, Sweden

First Page


Last Page








City or Country

New York

Creative Commons License

Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Additional URL