Detecting phishing attacks from URL by using NLP techniques

Title: Detecting phishing attacks from URL by using NLP techniques
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Buber, E., Dırı, B., Sahingoz, O. K.
Conference Name: 2017 International Conference on Computer Science and Engineering (UBMK)
ISBN Number: 978-1-5386-0930-9
Keywords: Computer crime, Cyber Attack Detection, cyber attack threats, cyber security, Human Behavior, Internet, Internet users, Law, learning (artificial intelligence), machine learning, machine learning-based system, Markov processes, Nanoelectromechanical systems, natural language processing, natural language processing techniques, NLP, phishing attack, phishing attack analysis report, Postal services, pubcrawl, random forest algorithm, Resiliency, Scalability, security of data, Uniform resource locators, unsolicited e-mail, URL
Abstract

Cyber attacks affect many institutions and individuals and cause them serious financial losses. Phishing is one of the most common types of cyber attack; it exploits people's weaknesses to obtain confidential information about them. This type of attack threatens almost all internet users and institutions. Reducing the financial loss it causes requires both user awareness and applications that can detect such attacks. In the last quarter of 2016, Turkey ranked second behind China, with an impact rate of approximately 43%, in a Phishing Attack Analysis report covering 45 countries. This study first explains the characteristics of phishing attacks and then proposes a machine learning based system to detect them. In the proposed system, some features were extracted by using Natural Language Processing (NLP) techniques. The system examines the URLs used in phishing attacks before they are opened, using the extracted features. Many tests were run on the resulting system, and the Random Forest algorithm performed best among the tested algorithms, with a success rate of 89.9%.
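
The abstract does not list the features the authors extracted, so the following is only a minimal sketch of what URL-based feature extraction might look like before a URL is opened. The function names, the feature set, and the SUSPICIOUS_WORDS list are all assumptions made for illustration, not the paper's method.

```python
import re
from typing import Dict, List
from urllib.parse import urlparse

# Hypothetical keyword list for illustration only; not taken from the paper.
SUSPICIOUS_WORDS = {"login", "verify", "secure", "account", "update", "signin"}


def tokenize_url(url: str) -> List[str]:
    """Split a URL into lowercase alphanumeric tokens."""
    return [t for t in re.split(r"[^A-Za-z0-9]+", url.lower()) if t]


def extract_features(url: str) -> Dict[str, int]:
    """Compute simple lexical features from the URL string alone.

    The URL is never fetched; only its text is inspected.
    """
    host = urlparse(url).netloc
    tokens = tokenize_url(url)
    return {
        "url_length": len(url),
        "num_tokens": len(tokens),
        "longest_token": max((len(t) for t in tokens), default=0),
        "num_digits": sum(c.isdigit() for c in url),
        "num_dots_in_host": host.count("."),
        "has_at_symbol": int("@" in url),
        "has_hyphen_in_host": int("-" in host),
        "suspicious_word_count": sum(t in SUSPICIOUS_WORDS for t in tokens),
    }


if __name__ == "__main__":
    example = "http://secure-login.example.com/verify/account123"
    for name, value in extract_features(example).items():
        print(f"{name}: {value}")
```

Feature dictionaries like these could be turned into numeric vectors and passed to a classifier such as a Random Forest, the algorithm the abstract reports as performing best. The training data and the classifier configuration would have to come from the paper itself.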

URL: http://ieeexplore.ieee.org/document/8093406/
DOI: 10.1109/UBMK.2017.8093406
Citation Key: buber_detecting_2017