A comprehensive study of OOP-related bugs in C++ compilers

Publication Type

Journal Article

Publication Date

6-2025

Abstract

Modern C++, a programming language characterized by its extensive use of object-oriented programming (OOP) features, is widely used for system programming. However, C++ compilers often struggle to correctly handle these sophisticated OOP features, resulting in numerous high-profile compiler bugs that can lead to crashes or miscompilation. Despite the significance of OOP-related bugs, existing studies largely overlook OOP features, hindering their ability to discover such bugs. To assist both compiler fuzzer designers and compiler developers, we conduct a comprehensive study of the compiler bugs caused by incorrectly handling C++ OOP-related features. First, we systematically extract 788 OOP-related C++ compiler bugs from GCC and LLVM. Second, drawing on the core concepts of OOP and C++, we manually derive a two-level taxonomy of the OOP-related features leading to compiler bugs, which consists of 6 primary categories (e.g., Abstraction & Encapsulation, Inheritance, and Runtime Polymorphism), along with 17 secondary categories (e.g., Constructors & Destructors and Multiple Inheritance). Third, we systematically analyze the root causes, symptoms, fixes, options, and C++ standard versions of these bugs. Our analysis yields 13 key findings, highlighting that features related to the construction and destruction of objects lead to the highest number of bugs, that crashes are the most frequent symptom, and that while the average time from bug introduction to discovery is 1856 days, fixing a bug once discovered takes only 174 days on average. Additionally, more than half of the bugs can be triggered without any compiler options. These findings offer valuable insights not only for developing new compiler testing approaches but also for improving language design and compiler engineering. Inspired by these findings, we developed a proof-of-concept compiler fuzzer, OOPFuzz, that specifically targets OOP-related bugs in C++ compilers. We applied it to the latest release versions of GCC and LLVM. In about 3 hours, it detected 9 bugs, of which 3 have been confirmed by the developers, including a bug in LLVM that had persisted for 13 years. The results indicate that our taxonomy and analysis provide valuable insights for future research in compiler testing.

Keywords

Computer Bugs, C++ Languages, Testing, Taxonomy, Codes, Encapsulation, Sun, Standards, Optimization, Guidelines, C++ OOP Features, Compiler Testing, Empirical Study, C++ Compiler, Study Of Bugs, Programming Language, Second Category, Frequency Of Symptoms, Fuzzy Set, Primary Categories, Complex Mechanisms, Common Symptoms, Key Concepts, General Strategy, Access Control, Number Of Options, Debugging, Base Classes, Code Generation, Analysis Of Symptoms, Error Handling, Corner Cases, Bug Fixes, Bug Reports, Private Members, Version Control System

Discipline

Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

IEEE Transactions on Software Engineering

Volume

51

Issue

6

First Page

1762

Last Page

1782

ISSN

0098-5589

Identifier

10.1109/TSE.2025.3566490

Publisher

Institute of Electrical and Electronics Engineers

Additional URL

https://doi.org/10.1109/TSE.2025.3566490
