Publication Type
Journal Article
Version
publishedVersion
Publication Date
4-2013
Abstract
Context: SQL injection (SQLI) and cross site scripting (XSS) are the two most common and serious web application vulnerabilities for the past decade. To mitigate these two security threats, many vulnerability detection approaches based on static and dynamic taint analysis techniques have been proposed. Alternatively, there are also vulnerability prediction approaches based on machine learning techniques, which showed that static code attributes such as code complexity measures are cheap and useful predictors. However, current prediction approaches target general vulnerabilities. And most of these approaches locate vulnerable code only at software component or file levels. Some approaches also involve process attributes that are often difficult to measure.
Objective: This paper aims to provide an alternative or complementary solution to existing taint analyzers by proposing static code attributes that can be used to predict specific program statements, rather than software components, which are likely to be vulnerable to SQLI or XSS.
Method: From the observations of input sanitization code that are commonly implemented in web applications to avoid SQLI and XSS vulnerabilities, in this paper, we propose a set of static code attributes that characterize such code patterns. We then build vulnerability prediction models from the historical information that reflect proposed static attributes and known vulnerability data to predict SQLI and XSS vulnerabilities.
Results: We developed a prototype tool called PhpMinerI for data collection and used it to evaluate our models on eight open source web applications. Our best model achieved an averaged result of 93% recall and 11% false alarm rate in predicting SQLI vulnerabilities, and 78% recall and 6% false alarm rate in predicting XSS vulnerabilities.
Conclusion: The experiment results show that our proposed vulnerability predictors are useful and effective at predicting SQLI and XSS vulnerabilities.
Keywords
Vulnerability prediction, Data mining, Web application vulnerability, Input sanitization, Static code attributes, Empirical study
Discipline
Data Storage Systems | Software Engineering
Research Areas
Cybersecurity
Publication
Information and Software Technology
Volume
55
Issue
10
First Page
1767
Last Page
1780
ISSN
0950-5849
Identifier
10.1016/j.infsof.2013.04.002
Publisher
Elsevier
Citation
SHAR, Lwin Khin and TAN, Hee Beng Kuan.
Predicting SQL injection and cross site scripting vulnerabilities through mining input sanitization patterns. (2013). Information and Software Technology. 55, (10), 1767-1780.
Available at: https://ink.library.smu.edu.sg/sis_research/4896
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1016/j.infsof.2013.04.002