Publication Type

Journal Article

Version

acceptedVersion

Publication Date

11-2024

Abstract

Large Language Models (LLMs) have significantly impacted numerous domains, including Software Engineering (SE). Many recent publications have explored LLMs applied to various SE tasks. Nevertheless, a comprehensive understanding of the application, effects, and possible limitations of LLMs on SE is still in its early stages. To bridge this gap, we conducted a Systematic Literature Review (SLR) on LLM4SE, with a particular focus on understanding how LLMs can be exploited to optimize processes and outcomes. We selected and analyzed 395 research articles from January 2017 to January 2024 to answer four key Research Questions (RQs). In RQ1, we categorize different LLMs that have been employed in SE tasks, characterizing their distinctive features and uses. In RQ2, we analyze the methods used in data collection, pre-processing, and application, highlighting the role of well-curated datasets for successful LLM4SE implementations. RQ3 investigates the strategies employed to optimize and evaluate the performance of LLMs in SE. Finally, RQ4 examines the specific SE tasks where LLMs have shown success to date, illustrating their practical contributions to the field. From the answers to these RQs, we discuss the current state of the art and trends, identify gaps in existing research, and highlight promising areas for future study. Our artifacts are publicly available at https://github.com/security-pride/LLM4SE_SLR.

Keywords

Software Engineering, Large Language Model, Survey

Discipline

Artificial Intelligence and Robotics | Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

ACM Transactions on Software Engineering and Methodology

Volume

33

Issue

8

First Page

1

Last Page

79

ISSN

1049-331X

Identifier

10.1145/3695988

Publisher

Association for Computing Machinery (ACM)

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1145/3695988
