Publication Type

PhD Dissertation

Publication Date

5-2017

Abstract

Software systems are often released with bugs due to system complexity and inadequate testing. The bug resolution process plays an important role in the development and evolution of software systems because developers may receive a considerable number of bug reports from users and testers every day. For instance, during September 2015, the Eclipse project received approximately 2,500 bug reports, averaging 80 new reports each day. To help developers effectively address and manage bugs, bug tracking systems such as Bugzilla and JIRA are adopted to manage the life cycle of a bug through its bug report. Most of the information related to bugs is stored in software repositories, e.g., bug tracking systems, version control repositories, and mailing list archives. These repositories contain a wealth of valuable information, which can be mined to automate the bug management process and thus save developers time and effort. In this thesis, I target the automation of three bug management tasks, i.e., bug prioritization, bug assignment, and stable-related patch identification. Bug prioritization helps developers ensure that the most important reports are handled and fixed first. For automated bug prioritization, we propose an approach that recommends a priority level based on information available in bug reports by considering multiple factors, including temporal, textual, author, related-report, severity, and product, that potentially affect the priority level of a bug report. After being prioritized, each reported bug must be assigned to an appropriate developer or team to handle it. This bug assignment process is important because assigning a bug report to the wrong developer or team can increase the overall time required to fix the bug, and thus increase project maintenance cost. Moreover, this process is time-consuming and non-trivial, since it requires a good understanding of the bug report, the source code, and the team members.
To automate the bug assignment process, we propose a unified model based on a learning-to-rank technique. The unified model naturally combines location-based information and activity-based information extracted from historical bug reports and source code for more accurate recommendation. After developers have fixed their bugs, they submit patches that resolve the bugs to bug tracking systems. The submitted patches are reviewed and verified by other developers to ensure their correctness. In the last stage of the bug management process, verified patches are applied to the software code. In this stage, many software projects maintain multiple versions of their systems. For instance, developers of the Linux kernel frequently release new versions, including bug fixes and new features, while maintaining some older “longterm” versions, which provide a stable, reliable, and secure execution environment to users. Maintaining longterm versions raises the problem of how to identify patches that are submitted to the current version but should be backported to the longterm versions as well. To help developers find patches that should be moved to the longterm stable versions, we present two approaches that automatically identify bug-fixing patches based on the changes and commit messages recorded in code repositories. One approach is based on hand-crafted features and two machine learning techniques, i.e., LPU (Learning from Positive and Unlabeled Examples) and SVM (Support Vector Machine). The other approach is based on a convolutional neural network (CNN), which automatically learns features from patches.

Keywords

software engineer, mining software repositories, software maintenance, software bug, software bug management, automated software engineering

Degree Awarded

PhD in Information Systems

Discipline

Information Security | Software Engineering

Supervisor(s)

LO, David

Publication

Singapore Management University

City or Country

Singapore

Copyright Owner and License

Singapore Management University

Creative Commons License

Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License
