"Isolating compiler bugs by generating effective witness programs with Large Language Models" by Haoxin TU, Zhide ZHOU et al.
 

Isolating compiler bugs by generating effective witness programs with Large Language Models

Publication Type

Journal Article

Publication Date

May 2024

Abstract

Compiler bugs pose a significant threat to safety-critical applications, and promptly and effectively isolating these bugs is crucial for assuring the quality of compilers. However, the limited availability of debugging information on reported bugs complicates the compiler bug isolation task. Existing compiler bug isolation approaches convert the problem into a test program mutation problem, but they are still limited by ineffective mutation strategies or high human effort requirements. Drawing inspiration from the recent progress of pre-trained Large Language Models (LLMs), such as ChatGPT, in code generation, we propose a new approach named LLM4CBI to utilize LLMs to generate effective test programs for compiler bug isolation. However, using LLMs directly for test program mutation may not yield the desired results due to the challenges associated with formulating precise prompts and selecting specialized prompts. To overcome these challenges, three new components are designed in LLM4CBI. First, LLM4CBI utilizes a program complexity-guided prompt production component, which leverages data and control flow analysis to identify the most valuable variables and locations in programs for mutation. Second, LLM4CBI employs a memorized prompt selection component, which adopts reinforcement learning to continuously select specialized prompts for mutating test programs. Third, a test program validation component is proposed to select specialized feedback prompts to avoid repeating the same mistakes during the mutation process. Our evaluation on 120 real bugs from the two most popular compilers, GCC and LLVM, demonstrates the advantages of LLM4CBI over the state-of-the-art approaches DiWi and RecBi: it isolates 69.70%/21.74% and 24.44%/8.92% more bugs than DiWi and RecBi, respectively, within the Top-1/Top-5 ranked results.
Additionally, we demonstrate that the LLM component (i.e., GPT-3.5) used in LLM4CBI can be easily replaced by other LLMs while still achieving reasonable results compared with related studies.
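The memorized prompt selection component described above chooses among mutation prompts based on how well each has performed so far. A minimal sketch of this idea, assuming an epsilon-greedy bandit over prompt templates (the class name, prompt strings, and reward scheme are illustrative, not taken from the paper):

```python
import random

class PromptSelector:
    """Illustrative epsilon-greedy bandit over mutation-prompt templates."""

    def __init__(self, prompts, epsilon=0.1):
        self.prompts = list(prompts)
        self.epsilon = epsilon
        self.counts = {p: 0 for p in self.prompts}     # times each prompt was tried
        self.values = {p: 0.0 for p in self.prompts}   # running mean reward per prompt

    def select(self):
        # Explore a random prompt with probability epsilon;
        # otherwise exploit the prompt with the best observed reward.
        if random.random() < self.epsilon:
            return random.choice(self.prompts)
        return max(self.prompts, key=lambda p: self.values[p])

    def update(self, prompt, reward):
        # Incremental mean update: "memorizes" which prompts tended to
        # produce valid, bug-triggering test programs (higher reward).
        self.counts[prompt] += 1
        n = self.counts[prompt]
        self.values[prompt] += (reward - self.values[prompt]) / n
```

In an isolation loop, one would select a prompt, ask the LLM to mutate the test program with it, validate the mutant, and feed back a reward (e.g., 1.0 if the mutant compiles and still triggers the bug, 0.0 otherwise), so successful prompts are chosen more often over time.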

Keywords

Software debugging, Bug isolation, Compilers, GCC, LLVM, Reinforcement learning, Large language models (LLMs).

Discipline

Artificial Intelligence and Robotics | Computer Sciences

Research Areas

Software and Cyber-Physical Systems

Publication

IEEE Transactions on Software Engineering

Volume

50

Issue

7

First Page

1768

Last Page

1788

ISSN

0098-5589

Identifier

10.1109/TSE.2024.3397822

Publisher

Institute of Electrical and Electronics Engineers

Comments

PDF provided by faculty

Additional URL

https://doi.org/10.1109/TSE.2024.3397822

