Publication Type
Journal Article
Version
acceptedVersion
Publication Date
8-2025
Abstract
Open-source software (OSS) has experienced a surge in popularity, attributed to its collaborative development model and cost-effective nature. However, the adoption of specific software versions in development projects may introduce security risks when these versions bring along vulnerabilities. Current methods of identifying vulnerable versions typically analyze and extract the code features involved in vulnerability patches using static analysis with pre-defined rules. They then use code clone detection to identify the vulnerable versions. These methods are hindered by imprecision due to (1) the inclusion of vulnerability-irrelevant code in the analysis and (2) the inadequacy of code clone detection. This paper presents VERCATION, an approach designed to identify vulnerable versions of OSS written in C/C++. VERCATION combines program slicing with a Large Language Model (LLM) to identify vulnerability-relevant code from vulnerability patches. It then backtracks historical commits to gather previous modifications of the identified vulnerability-relevant code. We propose code clone detection based on expanded and normalized ASTs to compare the differences between pre-modification and post-modification code, thereby locating the vulnerability-introducing commit (vic) and enabling the identification of the vulnerable versions between the vulnerability-fixing commit and the vic. We curate a dataset linking 122 OSS vulnerabilities and 1,211 versions to evaluate VERCATION. On this dataset, our approach achieves an F1 score of 93.1%, outperforming current state-of-the-art methods. More importantly, VERCATION detected 202 incorrect vulnerable OSS versions in NVD reports.
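The final step described in the abstract, mapping the span between the vic and the vulnerability-fixing commit onto released versions, can be illustrated with a minimal sketch. This is not VERCATION's implementation: it assumes a single linearized commit history in which every commit has an integer position, and the names Release, vulnerable_versions, vic_position and vfc_position, as well as the sample tags and positions, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    tag: str       # hypothetical version label, e.g. "1.1"
    position: int  # index of the release commit in an assumed linear history


def vulnerable_versions(releases, vic_position, vfc_position):
    """Return tags of releases cut at or after the vic and before the fix.

    Assumption: history is linear, so a release contains the vulnerability
    exactly when its commit lies in [vic_position, vfc_position).
    """
    if vic_position >= vfc_position:
        raise ValueError("the introducing commit must precede the fixing commit")
    ordered = sorted(releases, key=lambda r: r.position)
    return [r.tag for r in ordered if vic_position <= r.position < vfc_position]


# Illustrative data only; not taken from the paper's dataset.
releases = [
    Release("1.0", 10),
    Release("1.1", 25),
    Release("1.2", 40),
    Release("2.0", 60),
]
print(vulnerable_versions(releases, vic_position=20, vfc_position=45))
# prints ['1.1', '1.2']
```

Real repositories have branches and backported fixes, so a practical tool would need to test commit ancestry per release rather than compare positions on one line of history.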
Keywords
Open-source software security, Vulnerable version, Large Language Model
Discipline
Information Security
Research Areas
Cybersecurity
Areas of Excellence
Digital transformation
Publication
IEEE Transactions on Software Engineering
First Page
1
Last Page
19
ISSN
0098-5589
Identifier
10.1109/TSE.2025.3599581
Publisher
Institute of Electrical and Electronics Engineers
Citation
CHENG, Yiran; ZHANG, Ting; SHAR, Lwin Khin; YANG, Shouguo; DONG, Chaopeng; LO, David; LV, Shichao; SHI, Zhiqiang; and SUN, Limin.
VERCATION: Precise vulnerable open-source software version identification based on static analysis and LLM. (2025). IEEE Transactions on Software Engineering. 1-19.
Available at: https://ink.library.smu.edu.sg/sis_research/10481
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/TSE.2025.3599581