Title | Towards a Deep Learning Model for Vulnerability Detection on Web Application Variants |
Publication Type | Conference Paper |
Year of Publication | 2020 |
Authors | Fidalgo, Ana; Medeiros, Ibéria; Antunes, Paulo; Neves, Nuno |
Conference Name | 2020 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW) |
Keywords | compositionality, Deep Learning, feature extraction, Human Behavior, machine learning, Metrics, natural language processing, pubcrawl, Resiliency, security, Software, software security, Structured Query Language, Task Analysis, vulnerability detection, Web application vulnerabilities |
Abstract | Reported vulnerabilities have grown significantly in recent years, with SQL injection (SQLi) being one of the most prominent, especially in web applications. For these, the increase can be explained by the integration of multiple software parts (e.g., various plugins and modules), often developed by different organizations, which together compose web application variants. Machine Learning has the potential to be a great ally in finding vulnerabilities, aiding experts by reducing the search space or even by classifying programs on its own. However, previous work usually does not consider SQLi or uses techniques that are hard to scale. Moreover, there is a clear gap in vulnerability detection with machine learning for PHP, the most popular server-side language for web applications. This paper presents a Deep Learning model able to classify PHP slices as vulnerable (or not) to SQLi. As slices can belong to any variant, we propose the use of an intermediate language to represent the slices and interpret them as text, resorting to well-studied Natural Language Processing (NLP) techniques. Preliminary results show that the model can discover SQLi vulnerabilities, helping programmers and preventing attacks whose consequences would be costly to repair. |
DOI | 10.1109/ICSTW50294.2020.00083 |
Citation Key | fidalgo_towards_2020 |