Classification of XSS Attacks by Machine Learning with Frequency of Appearance and Co-occurrence

Submitted by aekwall on Mon, 09/28/2020 - 11:35am

Title	Classification of XSS Attacks by Machine Learning with Frequency of Appearance and Co-occurrence
Publication Type	Conference Paper
Year of Publication	2019
Authors	Akaishi, Sota; Uda, Ryuya
Conference Name	2019 53rd Annual Conference on Information Sciences and Systems (CISS)
Keywords	attack detection filter, Computer crime, cross site, Cross Site Scripting, dummy sites, fake HTML input form, HTTP cookies, Human Behavior, hypermedia markup languages, information collection, Internet, Kernel, learning (artificial intelligence), machine learning, machine learning algorithms, pattern classification, phishing, preprocessing method, pubcrawl, Radio frequency, Random Forest, Resiliency, Scalability, SCW, Support vector machines, SVM, Training, Uniform resource locators, vectorization, Word2Vec, XSS attack scripts
Abstract	Cross-site scripting (XSS) is a type of attack on the web. It enables session hijacking via HTTP cookies, information collection via fake HTML input forms, and phishing via dummy sites. As a countermeasure against XSS attacks, machine learning has attracted a lot of attention. Existing studies have used SVM, Random Forest, and SCW to detect the attack. However, these studies have two problems: the data sets are too small or unbalanced, and the preprocessing method used to vectorize strings causes misclassification. The highest classification accuracy reported in existing studies was 98%. Therefore, in this paper, we improved the preprocessing method for vectorization by using word2vec to capture the frequency of appearance and co-occurrence of the words in XSS attack scripts. We also used a large data set to reduce the deviation of the data. Furthermore, we evaluated the classification results with two procedures. One is an inappropriate procedure that some researchers tend to select by mistake. The other is an appropriate procedure that can be applied to an attack detection filter in a real environment.
DOI	10.1109/CISS.2019.8693047
Citation Key	akaishi_classification_2019
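
The abstract refers to the frequency of appearance and co-occurrence of words in XSS attack scripts. As a rough illustration of those two quantities only, the sketch below counts them directly with the Python standard library. It is not the authors' method and does not use word2vec; the tokenizer, the `window` parameter, the function names, and the two sample strings are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical tokenizer (an assumption, not taken from the paper):
# runs of word characters become one token, every other non-space
# character becomes its own token.
TOKEN_RE = re.compile(r"[A-Za-z0-9_]+|[^\sA-Za-z0-9_]")


def tokenize(script):
    """Split a script string into lowercase tokens."""
    return TOKEN_RE.findall(script.lower())


def frequency_and_cooccurrence(samples, window=2):
    """Count token frequency and unordered token-pair co-occurrence.

    Two tokens co-occur when they appear within `window` positions
    of each other in the same sample.
    """
    freq = Counter()
    cooc = Counter()
    for sample in samples:
        tokens = tokenize(sample)
        freq.update(tokens)
        for i, left in enumerate(tokens):
            # Pair each token with the next `window` tokens.
            for right in tokens[i + 1 : i + 1 + window]:
                cooc[tuple(sorted((left, right)))] += 1
    return freq, cooc


# Illustrative inputs only; not drawn from the paper's data set.
samples = [
    "<script>alert(1)</script>",
    "<img src=x onerror=alert(1)>",
]
freq, cooc = frequency_and_cooccurrence(samples)
print(freq.most_common(5))
print(cooc.most_common(5))
```

Counts like these could serve as simple features for a classifier, whereas word2vec learns dense vectors from the same kind of neighbouring-token context rather than storing raw counts.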
