Research Collection School Of Computing and Information Systems

Defects4C: Benchmarking large language model repair capability with C/C++ bugs

Jian WANG, Singapore Management University
Xiaofei XIE, Singapore Management University
Qiang HU
Shangqing LIU
Jiongchi YU, Singapore Management University
Jiaolong KONG, Singapore Management University
Yi LI

Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

11-2025

Abstract

Automated Program Repair (APR) plays a critical role in enhancing the quality and reliability of software systems. While substantial progress has been made in Java-based APR, largely facilitated by benchmarks like Defects4J, there remains a significant gap in research on C/C++ program repair, despite the widespread use of C/C++ and the prevalence of associated vulnerabilities. This gap is primarily due to the lack of high-quality, open-source benchmarks tailored for C/C++. To address this issue, we introduce Defects4C, a comprehensive and executable benchmark specifically designed for C/C++ program repair. Our dataset is constructed from real-world C/C++ repositories and includes a large collection of bug-relevant commits (9M in total), 248 high-quality buggy functions, and 102 vulnerable functions, all paired with test cases for reproduction. These resources enable rigorous evaluation of repair techniques and support the retraining of learning-based approaches for enhanced performance. Using Defects4C, we conduct a comprehensive empirical study evaluating the effectiveness of 24 state-of-the-art large language models (LLMs) in repairing C/C++ faults. Our findings offer valuable insights into the strengths and limitations of current LLM-based APR techniques in this domain, highlighting both the need for more robust methods and the critical role of Defects4C in advancing future research.

Discipline

Artificial Intelligence and Robotics

Research Areas

Intelligent Systems and Optimization

Areas of Excellence

Digital transformation

Publication

Proceedings of the 40th IEEE/ACM International Conference on Automated Software Engineering, Seoul, Korea, November 16-20

First Page

1

Last Page

12

City or Country

Korea

Citation

WANG, Jian; XIE, Xiaofei; HU, Qiang; LIU, Shangqing; YU, Jiongchi; KONG, Jiaolong; and LI, Yi. Defects4C: Benchmarking large language model repair capability with C/C++ bugs. (2025). Proceedings of the 40th IEEE/ACM International Conference on Automated Software Engineering, Seoul, Korea, November 16-20. 1-12.
Available at: https://ink.library.smu.edu.sg/sis_research/10635

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Included in

Artificial Intelligence and Robotics Commons
