Publication Type
Journal Article
Version
acceptedVersion
Publication Date
8-2023
Abstract
With the increasing reliance on Open Source Software, users are exposed to third-party library vulnerabilities. Software Composition Analysis (SCA) tools have been created to alert users to such vulnerabilities. SCA requires the identification of vulnerability-fixing commits. Prior works have proposed methods that can automatically identify such commits. However, identifying them is highly challenging, as only a very small minority of commits are vulnerability-fixing. Moreover, code changes can be noisy and difficult to analyze. We observe that noise can occur at different levels of detail, making it challenging to detect vulnerability fixes accurately. To address these challenges and boost the effectiveness of prior works, we propose MiDas (Multi-Granularity Detector for Vulnerability Fixes). Unlike prior works, MiDas constructs a separate neural network for each level of code change granularity (commit-level, file-level, hunk-level, and line-level), following their natural organization, and then uses an ensemble model combining all base models to output the final prediction. This design allows MiDas to better cope with the noisy and highly imbalanced nature of vulnerability-fixing commit data. In addition, to reduce the human effort required to inspect code changes, we have designed an effort-aware adjustment for MiDas's outputs based on commit length. The evaluation results demonstrate that MiDas outperforms the current state-of-the-art baseline on both Java and Python-based datasets in terms of AUC by 4.9% and 13.7%, respectively. Furthermore, in terms of two effort-aware metrics, i.e., EffortCost@L and Popt@L, MiDas also outperforms the state-of-the-art baseline by up to 28.2% and 15.9% on Java, and by up to 60% and 51.4% on Python, respectively.
Keywords
Vulnerability-fixing commit classification, Machine learning, Deep learning, Software security
Discipline
Artificial Intelligence and Robotics | Information Security
Research Areas
Cybersecurity; Intelligent Systems and Optimization; Software and Cyber-Physical Systems
Publication
IEEE Transactions on Software Engineering
Volume
49
Issue
8
First Page
4035
Last Page
4057
ISSN
0098-5589
Identifier
10.1109/TSE.2023.3281275
Publisher
Institute of Electrical and Electronics Engineers
Citation
NGUYEN, Truong Giang; CONG, Thanh Le; KANG, Hong Jin; WIDYASARI, Ratnadira; YANG, Chengran; ZHAO, Zhipeng; XU, Bowen; ZHOU, Jiayuan; XIA, Xin; HASSAN, Ahmed E.; and LO, David.
Multi-Granularity Detector for Vulnerability Fixes. (2023). IEEE Transactions on Software Engineering. 49, (8), 4035-4057.
Available at: https://ink.library.smu.edu.sg/sis_research/8508
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1109/TSE.2023.3281275