Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
5-2023
Abstract
It is common practice for OSS users to leverage and monitor security advisories to discover newly disclosed OSS vulnerabilities and their corresponding patches for vulnerability remediation. It is common for vulnerability fixes to be publicly available one week earlier than their disclosure. This gap in time provides an opportunity for attackers to exploit the vulnerability. Hence, OSS users need to sense the fix as early as possible so that the vulnerability can be remediated before it is exploited. However, it is common for OSS to adopt a vulnerability disclosure policy that causes the majority of vulnerabilities to be fixed silently, meaning the commit with the fix does not indicate any vulnerability information. In this case, even if a fix is identified, it is hard for OSS users to understand the vulnerability and evaluate its potential impact. To improve early sensing of vulnerabilities, the identification of silent fixes and their corresponding explanations (e.g., the corresponding common weakness enumeration (CWE) and exploitability rating) are equally important. However, it is challenging to identify silent fixes and provide explanations due to the limited and diverse data. To tackle this challenge, we propose CoLeFunDa: a framework consisting of a Contrastive Learner and FunDa, which is a novel approach for Function change Data augmentation. FunDa first increases the fix data (i.e., code changes) at the function level with unsupervised and supervised strategies. Then the contrastive learner leverages contrastive learning to effectively train a function change encoder, FCBERT, from diverse fix data. Finally, we leverage FCBERT to further fine-tune three downstream tasks, i.e., silent fix identification, CWE category classification, and exploitability rating classification, respectively. Our results show that CoLeFunDa outperforms all the state-of-the-art baselines in all downstream tasks.
We also conduct a survey to verify the effectiveness of CoLeFunDa in practical usage. The results show that CoLeFunDa can categorize 62.5% (25 out of 40) of CVEs with correct CWE categories within the top 2 recommendations.
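The abstract does not specify the contrastive objective used to train FCBERT. As a rough illustration of the general idea only, the sketch below computes a generic InfoNCE-style loss over plain Python vectors: an anchor embedding is pulled toward a "positive" (for example, an augmented function change addressing the same weakness) and pushed away from "negatives". The function names, the temperature value, and the toy vectors are all assumptions made for illustration and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style loss for one anchor (hypothetical helper).

    The loss is small when the anchor is much closer to the positive
    than to any negative, and large otherwise.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy embeddings (made up): the positive nearly matches the anchor.
anchor = [1.0, 0.0]
aligned_loss = info_nce_loss(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
# Swap roles: the "positive" is now orthogonal and a negative is close.
misaligned_loss = info_nce_loss(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
print(aligned_loss < misaligned_loss)
```

In a real encoder the vectors would come from the model and the loss would be averaged over a batch; this sketch only shows how the objective rewards grouping related function changes together.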
Keywords
Contrastive learning, Data augmentation, Disclosure policies, Down-stream, OSS vulnerability, Potential impacts, Security advisories, User need, Vulnerability disclosure, Vulnerability remediations
Discipline
Databases and Information Systems | Information Security
Research Areas
Data Science and Engineering
Publication
Proceedings of the International Conference on Software Engineering, Melbourne, May 15-16
First Page
2565
Last Page
2577
ISBN
9781665457019
Identifier
10.1109/ICSE48619.2023.00214
Publisher
IEEE
City or Country
New Jersey, USA
Citation
ZHOU, Jiayuan; PACHECO, Michael; CHEN, Jinfu; HU, Xing; XIA, Xin; LO, David; and HASSAN, Ahmed E.
CoLeFunDa: Explainable silent vulnerability fix identification. (2023). Proceedings of the International Conference on Software Engineering, Melbourne, May 15-16. 2565-2577.
Available at: https://ink.library.smu.edu.sg/sis_research/8513
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.