Research Collection School Of Computing and Information Systems

Revisiting supervised and unsupervised methods for effort-aware cross-project defect prediction

Chao NI, Zhejiang University
Xin XIA, Monash University
David LO, Singapore Management University
Xiang CHEN, Nanjing University
Qing GU, Nanjing University

Publication Type

Journal Article

Version

acceptedVersion

Publication Date

6-2020

Abstract

Cross-project defect prediction (CPDP), which aims to apply defect prediction models built on source projects to a target project, has been an active research topic. A variety of supervised CPDP methods and some simple unsupervised CPDP methods have been proposed. In a recent study, Zhou et al. found that simple unsupervised CPDP methods (i.e., ManualDown and ManualUp) have a prediction performance comparable or even superior to that of complex supervised CPDP methods. Therefore, they suggested that ManualDown should be treated as the baseline when considering non-effort-aware performance measures (NPMs) and ManualUp should be treated as the baseline when considering effort-aware performance measures (EPMs) in future CPDP studies. However, in that work, these unsupervised methods were compared with existing supervised CPDP methods in terms of only one or two NPMs, and the prediction results of the baselines were collected directly from the primary literature. Moreover, the comparison did not consider other recently proposed EPMs, which account for context switches and developer fatigue due to initial false alarms. Because of these limitations, the comparison between the supervised and unsupervised methods may not be holistic. In this paper, we aim to revisit Zhou et al.'s study. To the best of our knowledge, we are the first to compare the existing supervised CPDP methods and the unsupervised methods proposed by Zhou et al. in the same experimental setting, considering both NPMs and EPMs. We also propose an improved supervised CPDP method, EASC, and further compare it with the unsupervised methods. According to the results on 82 projects in terms of 12 performance measures, we find that when considering NPMs, EASC achieves results similar to those of the unsupervised method ManualDown, without a statistically significant difference in most cases. However, when considering EPMs, our proposed supervised method EASC statistically significantly outperforms the unsupervised method ManualUp, with a large improvement in terms of Cliff's delta in most cases. Therefore, the supervised CPDP methods are more promising than the unsupervised method in practical application scenarios, since limited testing resources and the impact on developers cannot be ignored in these scenarios.
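
For readers unfamiliar with the terms used in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the commonly described behaviour of the two unsupervised baselines (ManualDown ranks modules from largest to smallest, ManualUp from smallest to largest, with size measured in lines of code) and the standard definition of Cliff's delta as an effect size. The function names, the module names, the sizes, the scores, and the tie-breaking by module name are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Sequence


def manual_down(loc_by_module: Dict[str, int]) -> List[str]:
    """Rank modules from largest to smallest (assumed ManualDown ordering).

    Ties are broken by module name; that tie-break is an assumption.
    """
    return sorted(loc_by_module, key=lambda m: (-loc_by_module[m], m))


def manual_up(loc_by_module: Dict[str, int]) -> List[str]:
    """Rank modules from smallest to largest (assumed ManualUp ordering)."""
    return sorted(loc_by_module, key=lambda m: (loc_by_module[m], m))


def cliffs_delta(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Cliff's delta: (#pairs with x > y minus #pairs with x < y) / (|xs| * |ys|).

    The result lies in [-1, 1]; 0 means the two samples overlap completely.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


if __name__ == "__main__":
    # Hypothetical module sizes in lines of code.
    sizes = {"a.c": 120, "b.c": 40, "c.c": 300}
    print(manual_down(sizes))
    print(manual_up(sizes))
    # Hypothetical per-project scores for two methods.
    print(cliffs_delta([0.6, 0.7, 0.8], [0.4, 0.5, 0.65]))
```

In an effort-aware setting, inspecting small modules first keeps the inspection cost per module low, which is why the ascending ordering is the one the abstract associates with EPMs.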

Keywords

Defect prediction, supervised model, unsupervised model, cross-project

Discipline

Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

IEEE Transactions on Software Engineering

ISSN

0098-5589

Identifier

10.1109/TSE.2020.3001739

Publisher

IEEE

Embargo Period

5-11-2021

Citation

NI, Chao; XIA, Xin; LO, David; CHEN, Xiang; and GU, Qing. Revisiting supervised and unsupervised methods for effort-aware cross-project defect prediction. (2020). IEEE Transactions on Software Engineering. 1-16.
Available at: https://ink.library.smu.edu.sg/sis_research/5927

Copyright Owner and License

Authors

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1109/TSE.2020.3001739

Included in

Software Engineering Commons
