Title | URL Phishing Detection using Machine Learning Techniques based on URLs Lexical Analysis |
Publication Type | Conference Paper |
Year of Publication | 2021 |
Authors | Abutaha, Mohammed; Ababneh, Mohammad; Mahmoud, Khaled; Baddar, Sherenaz Al-Haj |
Conference Name | 2021 12th International Conference on Information and Communication Systems (ICICS) |
Keywords | Collaboration, composability, Forestry, GBC, machine learning, middleware security, Neural Network, Neural networks, Organizations, phishing, Phishing Detection, policy-based governance, pubcrawl, Random Forest, Robustness, Support vector machines, SVM, Uniform resource locators, URL analysis |
Abstract | Phishing URLs mainly target individuals and/or organizations through social engineering attacks that exploit gaps in users' information security awareness. These URLs lure online users to fake websites that harvest their confidential information, such as debit/credit card numbers and other sensitive data. In this work, we introduce a phishing detection technique based on URL lexical analysis and machine learning classifiers. The experiments were carried out on a dataset that originally contained 1,056,937 labeled URLs (phishing and legitimate). This dataset was processed to generate 22 different features, which were then reduced to a smaller set using several feature reduction techniques. Random Forest, Gradient Boosting, Neural Network and Support Vector Machine (SVM) classifiers were all evaluated, and the results show that the SVM achieved the highest accuracy in detecting the analyzed URLs, at 99.89%. Our approach can be incorporated into add-on/middleware features in Internet browsers to alert online users whenever they try to access a phishing website, using only its URL. |
DOI | 10.1109/ICICS52457.2021.9464539 |
Citation Key | abutaha_url_2021 |
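
The abstract describes deriving lexical features from the URL string alone before classification. The following is a minimal Python sketch of what such feature extraction might look like. The function name `lexical_features` and the specific features shown are illustrative assumptions, not the 22 features used in the paper, and the classifier training step is omitted.

```python
import re
from urllib.parse import urlsplit


def lexical_features(url: str) -> dict:
    """Compute a few illustrative lexical features from a URL string.

    These features are hypothetical examples; the paper's actual
    22-feature set is not listed in this record.
    """
    # urlsplit only recognises the host when a scheme is present,
    # so prepend a default scheme for bare "example.com/path" inputs.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_hyphens": url.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        # A dotted-quad host is a common lexical red flag.
        "host_is_ipv4": bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host)),
        # Count non-empty path segments.
        "path_depth": len([seg for seg in parts.path.split("/") if seg]),
        "uses_https": parts.scheme == "https",
    }


if __name__ == "__main__":
    for sample in (
        "https://www.example.com/index.html",
        "http://192.168.0.1/login-update/verify.php",
    ):
        print(sample, lexical_features(sample))
```

Feature dictionaries like these could then be converted to numeric vectors and passed to any of the classifiers named in the abstract. Because the features come from the URL string only, no page content needs to be fetched, which matches the abstract's claim of detection "using only its URL".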