Title | Spear Phishing Emails Detection Based on Machine Learning |
Publication Type | Conference Paper |
Year of Publication | 2021 |
Authors | Ding, Xiong; Liu, Baoxu; Jiang, Zhengwei; Wang, Qiuyun; Xin, Liling |
Conference Name | 2021 IEEE 24th International Conference on Computer Supported Cooperative Work in Design (CSCWD) |
Date Published | May |
Keywords | Companies, Conferences, feature extraction, Forwarding features, Human Behavior, interpolation, KMSMOTE, machine learning, machine learning algorithms, phishing, pubcrawl, Reputation features, spear phishing emails |
Abstract | Spear phishing emails target a specific individual or organization and are more elaborate, targeted, and harmful than ordinary phishing emails. Attackers usually harvest information about the recipient by any available means, then craft a carefully camouflaged email that lures the recipient into performing dangerous actions. In this paper we present a new, effective machine learning approach to detecting spear phishing emails. First, we extracted 21 stylometric features from the emails, 3 forwarding features from an Email Forwarding Relationship Graph Database (EFRGD), and 3 reputation features from two third-party threat intelligence platforms, VirusTotal (VT) and PhishTank (PT). Next, we developed KM-SMOTE, an improved version of the Synthetic Minority Oversampling Technique (SMOTE) algorithm, to reduce the impact of imbalanced data. Finally, we applied 4 machine learning algorithms to distinguish spear phishing emails from non-spear phishing emails. Our dataset consists of 417 spear phishing emails and 13916 non-spear phishing emails. With the help of the forwarding features, the reputation features, and the KM-SMOTE algorithm, we achieved a maximum recall of 95.56%, a maximum precision of 98.85%, and a maximum F1-score of 97.16%. |
DOI | 10.1109/CSCWD49262.2021.9437758 |
Citation Key | ding_spear_2021 |
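
The oversampling step named in the abstract can be illustrated with a minimal sketch of plain SMOTE-style interpolation. This is not the paper's KM-SMOTE, whose details the abstract does not give. The function name `smote_like_oversample`, the parameters `k` and `seed`, and the toy feature vectors are all assumptions made for illustration. Only the class sizes (417 and 13916) come from the abstract.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (plain SMOTE idea,
    NOT the KM-SMOTE variant described in the paper)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Toy 2-D feature vectors, invented for illustration (not data from the paper).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like_oversample(minority, n_new=6)

# Class sizes reported in the abstract.
n_spear, n_other = 417, 13916
needed_to_balance = n_other - n_spear  # synthetic samples for a 1:1 ratio
imbalance_ratio = n_other / n_spear    # majority samples per minority sample
```

Every synthetic point lies on a line segment between two existing minority samples, so oversampling adds no information beyond the region the minority class already covers. With the reported class sizes, reaching a 1:1 ratio would take 13499 synthetic spear phishing samples, which gives a sense of how severe the imbalance is.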