Classification of XSS Attacks by Machine Learning with Frequency of Appearance and Co-occurrence

Title: Classification of XSS Attacks by Machine Learning with Frequency of Appearance and Co-occurrence
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Akaishi, Sota; Uda, Ryuya
Conference Name: 2019 53rd Annual Conference on Information Sciences and Systems (CISS)
Keywords: attack detection filter, Computer crime, cross site, Cross Site Scripting, dummy sites, fake HTML input form, HTTP cookies, Human Behavior, hypermedia markup languages, information collection, Internet, Kernel, learning (artificial intelligence), machine learning, machine learning algorithms, pattern classification, phishing, preprocessing method, pubcrawl, Radio frequency, Random Forest, Resiliency, Scalability, SCW, Support vector machines, SVM, Training, Uniform resource locators, vectorization, Word2Vec, XSS attack scripts
Abstract: Cross-site scripting (XSS) is a type of attack on the web. It enables session hijacking through HTTP cookies, information collection through fake HTML input forms, and phishing through dummy sites. Machine learning has attracted considerable attention as a countermeasure against XSS attacks, and existing studies have used SVM, Random Forest, and SCW to detect them. However, those studies have two problems: the data sets are too small or unbalanced, and the preprocessing method used to vectorize strings causes misclassification. The highest classification accuracy reported in existing studies was 98%. In this paper, we therefore improved the preprocessing method for vectorization by using word2vec to capture the frequency of appearance and co-occurrence of words in XSS attack scripts. We also used a large data set to reduce the deviation of the data. Furthermore, we evaluated the classification results with two procedures. One is an inappropriate procedure that some researchers tend to select by mistake. The other is an appropriate procedure that can be applied to an attack detection filter in a real environment.
DOI: 10.1109/CISS.2019.8693047
Citation Key: akaishi_classification_2019
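
The abstract refers to the frequency of appearance and co-occurrence of words in XSS attack scripts. As a rough illustration of what those two statistics are, the sketch below counts them directly with the Python standard library. It is not the authors' method: the paper uses word2vec, whereas this example only does raw counting. The tokenizer, the `window` parameter, the function names, and the two sample strings are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical tokenizer (an assumption, not taken from the paper):
# runs of letters/digits/underscores become one token, and every other
# non-space character becomes a single-character token.
TOKEN_RE = re.compile(r"[A-Za-z0-9_]+|[^\sA-Za-z0-9_]")


def tokenize(script):
    """Split a script string into lowercase tokens."""
    return [t.lower() for t in TOKEN_RE.findall(script)]


def frequency_and_cooccurrence(scripts, window=2):
    """Count token frequency and unordered co-occurrence within a window.

    `window` is the number of following tokens paired with each token;
    its default value is an illustrative choice.
    """
    freq = Counter()
    cooc = Counter()
    for script in scripts:
        tokens = tokenize(script)
        freq.update(tokens)
        for i, left in enumerate(tokens):
            for right in tokens[i + 1 : i + 1 + window]:
                # Sort the pair so (a, b) and (b, a) share one key.
                cooc[tuple(sorted((left, right)))] += 1
    return freq, cooc


# Illustrative sample strings, not data from the paper.
samples = [
    "<script>alert(1)</script>",
    "<img src=x onerror=alert(1)>",
]
freq, cooc = frequency_and_cooccurrence(samples)
print(freq.most_common(5))
print(cooc.most_common(5))
```

Counts like these only describe which tokens appear and which appear near each other. A model such as word2vec learns dense vectors from the same kind of context windows, and it is those vectors that would feed a classifier such as SVM or Random Forest.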