Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
6-2025
Abstract
The compiler bug duplication problem, in which many test failures are caused by the same compiler bug, can waste considerable time and resources when diagnosing the test failures produced by compiler testing. It is particularly challenging for silent compiler bugs, which do not produce any error messages. To address this problem, multiple white-box techniques have been proposed, but they are inapplicable in many practical scenarios. Black-box techniques are more practical, but the existing ones are less effective because they often rely on irrelevant syntactic information. To bridge this gap, we propose a novel black-box technique, BLADE, which aims to improve the effectiveness of black-box de-duplication by extracting failure-relevant semantic information from failure-triggering test programs. It first learns failure-relevant semantic information based on intermediate representation learning, using the classification of failure-triggering and failure-free test programs as the auxiliary objective, and then extracts that information through model interpretation. Our experiments on four widely used datasets (collected from GCC and LLVM) show that BLADE significantly outperforms the two existing black-box techniques, with average improvements of 36% and 12%, respectively, in the number of unique silent compiler bugs identified when analyzing the same number of test failures, and achieves effectiveness competitive with the state-of-the-art white-box techniques.
Keywords
Compiler Testing, Bug Deduplication
Discipline
Software Engineering
Research Areas
Software and Cyber-Physical Systems
Areas of Excellence
Digital transformation
Publication
Proceedings of the ACM on Software Engineering, Volume 2, Issue FSE, Trondheim, Norway, 2025 June 23-27
First Page
2359
Last Page
2381
Identifier
10.1145/3729375
City or Country
Trondheim, Norway
Citation
CHEN, Junjie; FAN, Xingyu; YANG, Chen; LIU, Shuang; and SUN, Jun.
De-duplicating silent compiler bugs via deep semantic representation. (2025). Proceedings of the ACM on Software Engineering, Volume 2, Issue FSE, Trondheim, Norway, 2025 June 23-27. 2359-2381.
Available at: https://ink.library.smu.edu.sg/sis_research/10289
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1145/3729375