Biblio
Today our world benefits from Internet of Things (IoT) technology; however, new security problems arise when these IoT devices are introduced into our homes. Because many of these IoT devices have access to the Internet and they have little to no security, they make our smart homes highly vulnerable to compromise. Some of the threats include IoT botnets and generic confidentiality, integrity, and availability (CIA) attacks. Our research explores botnet detection by experimenting with supervised machine learning and deep-learning classifiers. Further, our approach assesses classifier performance on unbalanced datasets that contain benign data, mixed in with small amounts of malicious data. We demonstrate that the classifiers can separate malicious activity from benign activity within a small IoT network dataset. The classifiers can also separate malicious activity from benign activity in increasingly larger datasets. Our experiments have demonstrated incremental improvement in results for (1) accuracy, (2) probability of detection, and (3) probability of false alarm. The best performance results include 99.9% accuracy, 99.8% probability of detection, and 0% probability of false alarm. This paper also demonstrates how the performance of these classifiers increases, as IoT training datasets become larger and larger.
Bitcoin, a peer-to-peer payment system and digital currency, is often involved in illicit activities such as scamming, ransomware attacks, illegal goods trading, and thievery. At the time of writing, the Bitcoin ecosystem has not yet been mapped and as such there is no estimate of the share of illicit activities. This paper provides the first estimation of the portion of cyber-criminal entities in the Bitcoin ecosystem. Our dataset consists of 854 observations categorised into 12 classes (out of which 5 are cybercrime-related) and a total of 100,000 uncategorised observations. The dataset was obtained from the data provider who applied three types of clustering of Bitcoin transactions to categorise entities: co-spend, intelligence-based, and behaviour-based. Thirteen supervised learning classifiers were then tested, of which four prevailed with a cross-validation accuracy of 77.38%, 76.47%, 78.46%, 80.76% respectively. From the top four classifiers, Bagging and Gradient Boosting classifiers were selected based on their weighted average and per class precision on the cybercrime-related categories. Both models were used to classify 100,000 uncategorised entities, showing that the share of cybercrime-related is 29.81% according to Bagging, and 10.95% according to Gradient Boosting with number of entities as the metric. With regard to the number of addresses and current coins held by this type of entities, the results are: 5.79% and 10.02% according to Bagging; and 3.16% and 1.45% according to Gradient Boosting.
Most anti-phishing solutions that exist today require scanning a large portion of the web, which is vast and equivalent to finding a needle in a haystack. In addition, such solutions are not very efficient. We propose a different approach. Our solution does not rely on the scanning of the entire Internet or a large portion of it and only needs access to the brand's traffic in order to be able to detect phishing attempts against that brand. By analyzing a sample of phishing websites, we find features that can be used to distinguish phishing websites from the legitimate ones. We then use these features to train a machine learning classifier capable of helping brands detect phishing attempts against them. Our approach can detect up to 86% of phishing attacks against the brands and is best used as a complementary tool to the existing anti-phishing solutions.
Botnet detection represents one of the most crucial prerequisites of successful botnet neutralization. This paper explores how accurate and timely detection can be achieved by using supervised machine learning as the tool of inferring about malicious botnet traffic. In order to do so, the paper introduces a novel flow-based detection system that relies on supervised machine learning for identifying botnet network traffic. For use in the system we consider eight highly regarded machine learning algorithms, indicating the best performing one. Furthermore, the paper evaluates how much traffic needs to be observed per flow in order to capture the patterns of malicious traffic. The proposed system has been tested through the series of experiments using traffic traces originating from two well-known P2P botnets and diverse non-malicious applications. The results of experiments indicate that the system is able to accurately and timely detect botnet traffic using purely flow-based traffic analysis and supervised machine learning. Additionally, the results show that in order to achieve accurate detection traffic flows need to be monitored for only a limited time period and number of packets per flow. This indicates a strong potential of using the proposed approach within a future on-line detection framework.