Research Collection School Of Computing and Information Systems

An Empirical Study of Classifier Combination on Cross-Project Defect Prediction

Yun ZHANG, Zhejiang University
David LO, Singapore Management University
Xin XIA, Zhejiang University
Jianling SUN, Zhejiang University

Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

7-2015

Abstract

To help developers better allocate testing and debugging effort, many software defect prediction techniques have been proposed in the literature. These techniques predict which classes are more likely to be buggy based on the past history of buggy classes, and they work well as long as enough data is available to train a prediction model. However, there is rarely enough training data for new software projects. To deal with this problem, cross-project defect prediction, which transfers a prediction model trained on data from one project to another, has been proposed and is regarded as a new challenge for defect prediction. So far, only a few cross-project defect prediction techniques have been proposed. To advance the state of the art, in this work we investigate 7 composite algorithms, which integrate multiple machine learning classifiers, to improve cross-project defect prediction. To evaluate the composite algorithms, we perform experiments on 10 open source software systems from the PROMISE repository, which contain a total of 5,305 instances labeled as defective or clean. We compare the composite algorithms with CODEP Logistic, the latest cross-project defect prediction algorithm proposed by Panichella et al., in terms of two standard evaluation metrics: cost effectiveness and F-measure. Our experimental results show that several algorithms outperform CODEP Logistic. Max performs best in terms of F-measure, and its average F-measure outperforms that of CODEP Logistic by 36.88%. Bagging J48 performs best in terms of cost effectiveness, and its average cost effectiveness outperforms that of CODEP Logistic by 15.34%.
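As a rough illustration of the ideas named in the abstract, the sketch below combines the outputs of several base classifiers with a max rule and scores the result with F-measure. It is not the authors' implementation. The abstract does not define Max, so reading it as "take the highest defect probability reported by any base classifier" is an assumption, as are the 0.5 threshold, the function names, and the made-up scores. The F-measure itself is the standard harmonic mean of precision and recall.

```python
def max_combine(probabilities):
    """Assumed reading of a max rule: keep the highest defect probability
    reported by any base classifier for one instance."""
    return max(probabilities)


def predict(probabilities, threshold=0.5):
    """Label an instance defective when the combined score reaches the
    threshold (0.5 is an assumption, not a value from the paper)."""
    return max_combine(probabilities) >= threshold


def f_measure(actual, predicted):
    """Harmonic mean of precision and recall for the defective (True) class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)        # defective, flagged
    fp = sum(1 for a, p in pairs if not a and p)    # clean, flagged
    fn = sum(1 for a, p in pairs if a and not p)    # defective, missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical defect probabilities from three base classifiers for four
# classes of a target project (illustrative numbers only).
scores = [
    [0.20, 0.70, 0.40],
    [0.10, 0.30, 0.20],
    [0.60, 0.50, 0.90],
    [0.40, 0.45, 0.30],
]
actual = [True, False, True, True]

predicted = [predict(s) for s in scores]
print(predicted)
print(round(f_measure(actual, predicted), 2))
```

With these made-up inputs, the max rule flags the first and third classes and misses the fourth, so precision is 1.0 and recall is 2/3. In a cross-project setting the base classifiers would be trained on one project and scored on another; that training step is omitted here.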

Keywords

Defect Prediction, Cross-Project, Classifier Combination

Discipline

Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

2015 IEEE 39th Annual Computer Software and Applications Conference: Taichung, Taiwan, July 1-5

First Page

264

Last Page

269

ISBN

9781467365642

Identifier

10.1109/COMPSAC.2015.58

Publisher

IEEE

City or Country

Piscataway, NJ

Citation

ZHANG, Yun; LO, David; XIA, Xin; and SUN, Jianling. An Empirical Study of Classifier Combination on Cross-Project Defect Prediction. (2015). 2015 IEEE 39th Annual Computer Software and Applications Conference: Taichung, Taiwan, July 1-5. 264-269.
Available at: https://ink.library.smu.edu.sg/sis_research/3099

Copyright Owner and License

Authors

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1109/COMPSAC.2015.58

Included in

Software Engineering Commons
