Publication Type

Journal Article

Version

publishedVersion

Publication Date

4-2013

Abstract

Context: SQL injection (SQLI) and cross-site scripting (XSS) have been the two most common and serious web application vulnerabilities of the past decade. To mitigate these two security threats, many vulnerability detection approaches based on static and dynamic taint analysis techniques have been proposed. Alternatively, there are also vulnerability prediction approaches based on machine learning techniques, which have shown that static code attributes such as code complexity measures are cheap and useful predictors. However, current prediction approaches target general vulnerabilities, and most of them locate vulnerable code only at the software component or file level. Some approaches also involve process attributes that are often difficult to measure.

Objective: This paper aims to provide an alternative or complementary solution to existing taint analyzers by proposing static code attributes that can be used to predict specific program statements, rather than software components, that are likely to be vulnerable to SQLI or XSS.

Method: Based on observations of the input sanitization code that is commonly implemented in web applications to avoid SQLI and XSS vulnerabilities, we propose a set of static code attributes that characterize such code patterns. We then build vulnerability prediction models from historical information that reflects the proposed static attributes and known vulnerability data to predict SQLI and XSS vulnerabilities.

Results: We developed a prototype tool called PhpMinerI for data collection and used it to evaluate our models on eight open source web applications. Our best model achieved an averaged result of 93% recall and 11% false alarm rate in predicting SQLI vulnerabilities, and 78% recall and 6% false alarm rate in predicting XSS vulnerabilities.

Conclusion: The experiment results show that our proposed vulnerability predictors are useful and effective at predicting SQLI and XSS vulnerabilities.

Keywords

Vulnerability prediction, Data mining, Web application vulnerability, Input sanitization, Static code attributes, Empirical study

Discipline

Data Storage Systems | Software Engineering

Research Areas

Cybersecurity

Publication

Information and Software Technology

Volume

55

Issue

10

First Page

1767

Last Page

1780

ISSN

0950-5849

Identifier

10.1016/j.infsof.2013.04.002

Publisher

Elsevier

Additional URL

https://doi.org/10.1016/j.infsof.2013.04.002
