Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
5-2022
Abstract
Just-In-Time (JIT) defect prediction aims to automatically predict whether a commit is defective, and has been widely studied in recent years. Most studies fall into two categories: 1) simple models that use traditional machine learning classifiers with hand-crafted features, and 2) complex models that use deep learning techniques to extract features automatically. The hand-crafted features used by simple models are based on expert knowledge but may not fully represent the semantic meaning of commits. Conversely, the deep learning-based features used by complex models represent the semantic meaning of commits but may not reflect useful expert knowledge. Simple and complex models therefore appear to be complementary to some extent. To exploit the advantages of both, we propose a combined model, SimCom, that fuses the prediction scores of one simple model and one complex model. The experimental results show that our approach significantly outperforms the state-of-the-art by 6.0-18.1%. In addition, our experimental results confirm that the simple model and the complex model are complementary to each other.
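The score-level fusion described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact fusion rule, so the weighted average used here, along with the names `fuse_scores`, `predict_defective`, `weight`, and `threshold`, are assumptions made for illustration only and should not be read as the authors' implementation.

```python
from typing import List, Sequence


def fuse_scores(
    simple_scores: Sequence[float],
    complex_scores: Sequence[float],
    weight: float = 0.5,
) -> List[float]:
    """Combine per-commit defect probabilities from two models.

    Assumption: fusion is a weighted average of the two scores.
    `weight` is the share given to the simple model (hypothetical parameter).
    """
    if len(simple_scores) != len(complex_scores):
        raise ValueError("both models must score the same commits")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [
        weight * s + (1.0 - weight) * c
        for s, c in zip(simple_scores, complex_scores)
    ]


def predict_defective(fused: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Label a commit as defective when its fused score reaches the threshold."""
    return [score >= threshold for score in fused]


if __name__ == "__main__":
    # Hypothetical scores for two commits from each model.
    simple = [0.25, 1.0]
    complex_ = [0.25, 0.5]
    fused = fuse_scores(simple, complex_)
    print(fused, predict_defective(fused))
```

Fusing at the score level keeps the two models independent: each can be trained and tuned separately, and only their output probabilities need to be aligned per commit.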
Keywords
Deep learning, Semantics, Predictive models, Feature extraction
Discipline
Databases and Information Systems
Research Areas
Data Science and Engineering
Publication
Proceedings of the 30th International Conference on Program Comprehension, Virtual Event, 2022 May 16-17
First Page
229
Last Page
240
Identifier
10.1145/3524610.3527910
Publisher
Institute of Electrical and Electronics Engineers
City or Country
New Jersey
Citation
ZHOU, Xin; HAN, DongGyun; and LO, David.
Simple or complex? Together for a more accurate just-in-time defect predictor. (2022). Proceedings of the 30th International Conference on Program Comprehension, Virtual Event, 2022 May 16-17. 229-240.
Available at: https://ink.library.smu.edu.sg/sis_research/7691
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1145/3524610.3527910