Detecting Malicious Web Requests Using an Enhanced TextCNN

Title	Detecting Malicious Web Requests Using an Enhanced TextCNN
Publication Type	Conference Paper
Year of Publication	2020
Authors	Yu, L., Chen, L., Dong, J., Li, M., Liu, L., Zhao, B., Zhang, C.
Conference Name	2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC)
Date Published	July
Keywords	composability, convolutional neural nets, convolutional neural network for text classification, Convolutional Neural Network for Text Classification (TextCNN), convolutional neural networks, Data models, Deep Learning, deep learning models, feature extraction, file servers, HTTP Dataset CSIC 2010, Internet, learning (artificial intelligence), machine learning, malicious Web request detection, Malicious Web requests detection, pattern classification, Predictive Metrics, pubcrawl, Resiliency, security of data, Semantics, statistical analysis, support vector machine, support vector machine (SVM), Support vector machines, text analysis, text classification, TextCNN, Transferable statistical features, Uniform resource locators, Web attack detection, web security, Web servers
Abstract	This paper proposes an approach that combines a deep learning-based method with a traditional machine learning-based method to efficiently detect malicious requests received by Web servers. The first few layers of a Convolutional Neural Network for Text Classification (TextCNN) automatically extract powerful semantic features, while transferable statistical features are defined to boost detection ability, specifically for Web request parameter tampering. The semantic features from TextCNN and the hand-designed transferable statistical features are combined and fed into a Support Vector Machine (SVM), which replaces the last layer of TextCNN for classification. To make the abstract numerical feature vectors extracted by TextCNN easier to interpret, the paper designs trace-back functions that map max-pooling outputs back to words in Web requests. After a survey of the currently available datasets for Web attack detection, HTTP Dataset CSIC 2010 was selected to test and verify the proposed approach. Experimental comparison with other deep learning models shows that the proposed approach is competitive with the state of the art.
DOI	10.1109/COMPSAC48688.2020.0-167
Citation Key	yu_detecting_2020
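
The trace-back idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the two-dimensional toy embedding, the single hand-set filter `KERNEL`, and the three statistical features are all hypothetical stand-ins chosen for readability. The sketch shows how max-over-time pooling can remember which token window produced the maximum, so the pooled value can be mapped back to words in the request, and how that pooled value might be combined with hand-designed statistics into one vector. The SVM step is omitted.

```python
import re

# Hypothetical toy vocabulary; a real TextCNN would learn its embeddings.
SQL_KEYWORDS = {"or", "and", "select", "union", "drop"}

# Hypothetical filter of width 3 that responds to: quote, keyword, quote.
KERNEL = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]


def tokenize(request: str) -> list[str]:
    """Split a request into word-like tokens and single punctuation marks."""
    return re.findall(r"[A-Za-z0-9_]+|[^\sA-Za-z0-9_]", request)


def embed(token: str) -> list[float]:
    """Toy 2-d embedding: [is SQL keyword, is single quote]."""
    return [float(token.lower() in SQL_KEYWORDS), float(token == "'")]


def conv_max_pool(embedded, kernel):
    """Slide one filter over token windows, apply ReLU, and max-pool.

    Returns (max_activation, start_index_of_winning_window).
    """
    width = len(kernel)
    best_score, best_start = float("-inf"), -1
    for start in range(len(embedded) - width + 1):
        window = embedded[start:start + width]
        score = sum(w * x for row, vec in zip(kernel, window)
                    for w, x in zip(row, vec))
        score = max(score, 0.0)  # ReLU
        if score > best_score:  # strict '>' keeps the first window on ties
            best_score, best_start = score, start
    return best_score, best_start


def trace_back(tokens, start, width):
    """Map a max-pooling output back to the words that produced it."""
    return tokens[start:start + width]


def statistical_features(request: str) -> list[float]:
    """Hypothetical hand-designed statistics over the query string."""
    query = request.partition("?")[2]
    n_params = len(query.split("&")) if query else 0
    n_special = sum(1 for ch in query if not ch.isalnum())
    return [float(len(query)), float(n_params), float(n_special)]


request = "GET /login?user=admin' OR '1'='1"
tokens = tokenize(request)
score, start = conv_max_pool([embed(t) for t in tokens], KERNEL)

# Semantic feature plus statistical features, as one input vector for an SVM.
combined = [score] + statistical_features(request)
print(trace_back(tokens, start, len(KERNEL)), combined)
```

In a trained network there would be many filters of several widths, and the trace-back would be applied per filter; the single filter here only makes the mapping from pooled value to source words easy to follow.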
