Using Supervised Machine Learning Algorithms to Detect Suspicious URLs in Online Social Networks

Submitted by grigby1 on Fri, 10/26/2018 - 5:08pm

Title	Using Supervised Machine Learning Algorithms to Detect Suspicious URLs in Online Social Networks
Publication Type	Conference Paper
Year of Publication	2017
Authors	Al-Janabi, Mohammed; Quincey, Ed de; Andras, Peter
Conference Name	Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017
Publisher	ACM
Conference Location	New York, NY, USA
ISBN Number	978-1-4503-4993-2
Keywords	Human Behavior, malicious URLs, Metrics, phishing, pubcrawl, Random Forest, Scalability, spam, spam detection, Twitter
Abstract	The increasing volume of malicious content in social networks requires automated methods to detect and eliminate such content. This paper describes a supervised machine learning classification model that has been built to detect the distribution of malicious content in online social networks (OSNs). Multisource features have been used to detect social network posts that contain malicious Uniform Resource Locators (URLs). These URLs could direct users to websites that contain malicious content, drive-by download attacks, phishing, spam, and scams. For the data collection stage, the Twitter streaming application programming interface (API) was used, and VirusTotal was used to label the dataset. A random forest classification model was used with a combination of features derived from a range of sources. The random forest model without any tuning or feature selection produced a recall value of 0.89. After further investigation and applying parameter tuning and feature selection methods, however, we were able to improve the classifier's recall to 0.92.
URL	http://doi.acm.org/10.1145/3110025.3116201
DOI	10.1145/3110025.3116201
Citation Key	al-janabi_using_2017
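
The abstract reports classifier quality as recall, that is, the fraction of truly malicious posts that the model flags. The record does not include the authors' evaluation code, so the following is only a minimal sketch of how that metric is computed. The function name `recall`, the label encoding (1 = malicious, 0 = benign), and the sample labels are assumptions made for illustration and are not data from the paper.

```python
def recall(y_true, y_pred, positive=1):
    """Return TP / (TP + FN) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    # True positives: malicious posts that the classifier flagged.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False negatives: malicious posts that the classifier missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive examples")
    return tp / (tp + fn)


# Hypothetical labels for ten posts (illustrative only, not from the paper).
# Assumed encoding: 1 = malicious URL, 0 = benign URL.
y_true = [1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]

score = recall(y_true, y_pred)
print(f"recall = {score:.2f}")
```

Recall ignores false positives (benign posts flagged as malicious), which is why it is often reported alongside precision when missing a malicious URL is the costlier error.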
