Combining Software Metrics and Text Features for Vulnerable File Prediction
Publication Type
Conference Proceeding Article
Publication Date
12-2015
Abstract
In recent years, a number of prediction models built on different kinds of features have been proposed to identify vulnerable source code files and thereby help developers reduce the time and effort required to build highly secure software. In this paper, we propose a novel approach, VULPREDICTOR, that predicts vulnerable files by analyzing software metrics and text mining together to build a composite prediction model. VULPREDICTOR first builds 6 underlying classifiers on a training set of vulnerable and non-vulnerable files represented by their software metrics and text features, and then constructs a meta classifier to process the outputs of the 6 underlying classifiers. We evaluate our solution on datasets from three web applications, Drupal, PHPMyAdmin and Moodle, which contain a total of 3,466 files and 223 vulnerabilities. The experimental results show that VULPREDICTOR can achieve F1 and EffectivenessRatio@20% scores of up to 0.683 and 75%, respectively. On average across the 3 projects, VULPREDICTOR improves the F1 and EffectivenessRatio@20% scores of the best-performing state-of-the-art approaches proposed by Walden et al. by 46.53% and 14.93%, respectively.
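The two-level design described in the abstract (underlying classifiers whose outputs feed a meta classifier) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the 6 underlying classifiers, so this sketch assumes just two hypothetical base learners (a nearest-centroid scorer on software metrics and a smoothed token log-odds scorer on text features) and a small logistic-regression meta classifier. The toy data and all function names are invented for illustration, and for brevity the meta classifier is trained on the base learners' scores for the same training files rather than on held-out scores.

```python
import math

def train_centroid(X, y):
    """Hypothetical base learner on software metrics: nearest class centroid."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    pos = mean([x for x, label in zip(X, y) if label == 1])
    neg = mean([x for x, label in zip(X, y) if label == 0])
    def score(x):
        d_pos, d_neg = math.dist(x, pos), math.dist(x, neg)
        total = d_pos + d_neg
        # Closer to the vulnerable centroid gives a score nearer to 1.
        return d_neg / total if total else 0.5
    return score

def train_token_model(docs, y):
    """Hypothetical base learner on text features: Laplace-smoothed token log-odds."""
    pos, neg = {}, {}
    for doc, label in zip(docs, y):
        counts = pos if label == 1 else neg
        for tok in doc:
            counts[tok] = counts.get(tok, 0) + 1
    vocab = set(pos) | set(neg)
    n_pos, n_neg = sum(pos.values()), sum(neg.values())
    def score(doc):
        log_odds = 0.0
        for tok in doc:
            if tok in vocab:  # unseen tokens are ignored
                p = (pos.get(tok, 0) + 1) / (n_pos + len(vocab))
                q = (neg.get(tok, 0) + 1) / (n_neg + len(vocab))
                log_odds += math.log(p / q)
        return 1 / (1 + math.exp(-log_odds))
    return score

def train_meta(Z, y, epochs=500, lr=0.5):
    """Meta classifier: logistic regression over the base learners' scores."""
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        for z, label in zip(Z, y):
            logit = sum(wi * zi for wi, zi in zip(w, z)) + b
            grad = 1 / (1 + math.exp(-logit)) - label
            w = [wi - lr * grad * zi for wi, zi in zip(w, z)]
            b -= lr * grad
    def predict(z):
        return 1 if sum(wi * zi for wi, zi in zip(w, z)) + b >= 0 else 0
    return predict

# Toy training set (invented): metrics are (lines of code, complexity).
metrics = [[120, 14], [30, 2], [200, 20], [25, 1]]
tokens = [["mysql_query", "$_GET"], ["echo", "header"],
          ["$_GET", "eval"], ["echo", "footer"]]
labels = [1, 0, 1, 0]  # 1 = vulnerable, 0 = non-vulnerable

metric_score = train_centroid(metrics, labels)
text_score = train_token_model(tokens, labels)
meta = train_meta([[metric_score(m), text_score(t)]
                   for m, t in zip(metrics, tokens)], labels)

def predict_file(m, t):
    """Composite prediction: base scores first, then the meta classifier."""
    return meta([metric_score(m), text_score(t)])

print(predict_file([150, 16], ["$_GET", "mysql_query"]))
print(predict_file([28, 2], ["echo", "footer"]))
```

The point of the second level is that the meta classifier learns how much weight each feature view deserves, rather than relying on a fixed vote between the metric-based and text-based predictions.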
Keywords
Machine Learning, Text Mining, Vulnerable File
Discipline
Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
20th International Conference on Engineering of Complex Computer Systems (ICECCS 2015)
First Page
40
Last Page
49
ISBN
9781467385817
Identifier
10.1109/ICECCS.2015.15
Publisher
IEEE
City or Country
Gold Coast, Australia
Citation
ZHANG, Yun; LO, David; XIA, Xin; XU, Bowen; SUN, Jianling; and LI, Shanping.
Combining Software Metrics and Text Features for Vulnerable File Prediction. (2015). 20th International Conference on Engineering of Complex Computer Systems (ICECCS 2015). 40-49.
Available at: https://ink.library.smu.edu.sg/sis_research/3097
Additional URL
http://dx.doi.org/10.1109/ICECCS.2015.15