Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
5-2023
Abstract
Just-in-time (JIT) defect prediction classifies changes as defect-inducing or clean, and many approaches have been proposed based on programming language-independent change-level features. However, programming languages differ in their characteristics, which may in turn affect the quality of software projects. Meanwhile, the C programming language, one of the most popular languages, is widely used in IT companies to develop foundational applications (e.g., operating systems, databases, and compilers), yet the impact of its change-level characteristics on project quality has not been fully investigated. Additionally, whether open-source C projects and commercial C projects share similar important features has received little study. To address these limitations, in this paper we investigate the impact of programming language-specific features on the state-of-the-art JIT defect identification approach in an industrial setting. We collect and label the top-10 most starred C projects on GitHub (329,021 commits) and 8 C projects in an ICT company (12,983 commits). We also propose nine C-specific change-level features and investigate both the open-source C projects on GitHub and the C projects at the ICT company from three aspects: (1) the effectiveness of C-specific change-level features in improving the identification of defect-inducing changes, (2) the importance of features in identifying defect-inducing changes in open-source C projects compared with commercial C projects, and (3) the effectiveness of combining language-independent features and C-specific features in a real-life setting at the ICT company.
Keywords
C++ programming, C/C++ programming language, Code changes, Defect prediction, Just-in-time, Language independents, Open-source, Quality of software, Software project, Supervised methods
Discipline
Programming Languages and Compilers | Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
2023 IEEE/ACM 20th International Conference on Mining Software Repositories (MSR): Melbourne, May 15-16: Proceedings
First Page
472
Last Page
484
ISBN
9798350311846
Identifier
10.1109/MSR59073.2023.00072
Publisher
IEEE
City or Country
Piscataway, NJ
Citation
NI, Chao; XU, Xiaodan; YANG, Kaiwen; and LO, David.
Boosting just-in-time defect prediction with specific features of C/C++ programming languages in code changes. (2023). 2023 IEEE/ACM 20th International Conference on Mining Software Repositories (MSR): Melbourne, May 15-16: Proceedings. 472-484.
Available at: https://ink.library.smu.edu.sg/sis_research/8623
Copyright Owner and License
Authors
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/MSR59073.2023.00072