Biblio
Distributed Denial of Service (DDoS) attacks have become a serious threat to network infrastructure and can cause major disruption to information and communication technology systems. They aim to paralyze networks by overloading servers, network links, and network devices with illegitimate traffic, so detecting and mitigating them is important for limiting their impact. In traditional networks, the hardware and software needed to detect and mitigate DDoS attacks are expensive and difficult to deploy. Software-Defined Networking (SDN) is a network architecture paradigm that separates the control plane from the data plane, thereby improving scalability, flexibility, control, and network management. As a result, SDN can dynamically change the forwarding rules applied to DDoS traffic and improve network security. In this study, a DDoS attack detection and mitigation system was built on the SDN architecture using the random forest machine-learning algorithm. The random forest algorithm classifies packets as normal or attack traffic based on flow entries. If packets are classified as a DDoS attack, the attack is mitigated by adding flow rules to the switch. In tests, the detection system detected DDoS attacks with an average accuracy of 98.38% and an average detection time of 36 ms. The mitigation system mitigated DDoS attacks with an average mitigation time of 1179 ms, reduced the average number of attack packets reaching the victim host by 15672 packets, and reduced the average CPU usage on the controller by 44.9%.
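The detect-then-mitigate loop described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three hand-written "trees", the feature names (`packets_per_s`, `bytes_per_packet`, `duration_s`), the thresholds, and the dictionary layout of the drop rule are hypothetical and are not taken from the paper. The rule is modeled loosely on OpenFlow, where a flow entry with no actions drops matching packets. A real system would train the forest on labeled flow statistics and push rules through its controller's own API.

```python
from collections import Counter


# Hypothetical stand-ins for trained decision trees: each maps a
# flow-feature dict to "attack" or "normal". Thresholds are made up.
def tree_rate(flow):
    return "attack" if flow["packets_per_s"] > 1000 else "normal"


def tree_size(flow):
    return "attack" if flow["bytes_per_packet"] < 80 else "normal"


def tree_duration(flow):
    return "attack" if flow["duration_s"] < 1.0 else "normal"


def forest_predict(trees, flow):
    """Random-forest style prediction: majority vote over the trees."""
    votes = Counter(tree(flow) for tree in trees)
    return votes.most_common(1)[0][0]


def drop_rule_for(flow):
    """Build an illustrative drop rule (empty action list) for one flow."""
    return {
        "match": {"ipv4_src": flow["src_ip"], "ipv4_dst": flow["dst_ip"]},
        "actions": [],       # no actions: matching packets are dropped
        "priority": 100,     # assumed value, higher than normal forwarding
    }


def mitigate(flows, trees):
    """Return drop rules for every flow the forest labels as an attack."""
    return [drop_rule_for(f) for f in flows if forest_predict(trees, f) == "attack"]


TREES = [tree_rate, tree_size, tree_duration]

flows = [
    {"src_ip": "10.0.0.9", "dst_ip": "10.0.0.1",
     "packets_per_s": 5000, "bytes_per_packet": 60, "duration_s": 0.4},
    {"src_ip": "10.0.0.3", "dst_ip": "10.0.0.1",
     "packets_per_s": 20, "bytes_per_packet": 900, "duration_s": 12.0},
]
rules = mitigate(flows, TREES)
```

Only the first flow receives a drop rule in this example, since all three stand-in trees vote "attack" for it and "normal" for the second.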
Short Message Service (SMS) is nowadays the most used means of communication in the electronic world. While much research exists on email spam detection, far less is known about spam sent over SMS. This may be because spam is much less frequent in short messages than in email. This paper presents different ways of analyzing SMS spam and a new pre-processing method for obtaining a dataset of actual spam messages. This dataset was then used with different algorithms to find the one that performs best in terms of both accuracy and recall. The Random Forest algorithm was then implemented in a real-world application library written in C# for cross-platform .NET development. This library can use a prebuilt model to classify the messages in a new dataset as spam or ham.
To address the risk of data privacy disclosure in the classification process, this paper proposes a Random Forest algorithm under differential privacy, named DPRF-gini. When building each decision tree, the algorithm first perturbs feature selection and attribute partitioning using the exponential mechanism, and then satisfies the requirement of differential privacy by adding Laplace noise to the leaf nodes. Empirical results show that, compared with the original algorithm, the protection of data privacy is further enhanced while the accuracy of the algorithm is slightly reduced.
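The two mechanisms named in this abstract can be sketched in a few lines. This is a generic illustration of the exponential mechanism and of Laplace noise on leaf counts, not the DPRF-gini algorithm itself. The function names, the example splits, the privacy budget `epsilon=1.0`, and in particular the `sensitivity=0.5` passed to the exponential mechanism are assumptions chosen for the example; the paper's budget allocation and sensitivity analysis are not reproduced here.

```python
import math
import random


def gini_impurity(class_counts):
    """Gini impurity of a node given its per-class record counts."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in class_counts)


def split_score(partitions):
    """Negative weighted Gini impurity of the children (higher is better)."""
    total = sum(sum(p) for p in partitions)
    return -sum(sum(p) / total * gini_impurity(p) for p in partitions)


def exponential_mechanism(scores, epsilon, sensitivity, rng):
    """Pick a key with probability proportional to exp(eps * score / (2 * sens))."""
    keys = list(scores)
    best = max(scores.values())  # shift by the max for numerical stability
    weights = [math.exp(epsilon * (scores[k] - best) / (2.0 * sensitivity))
               for k in keys]
    return rng.choices(keys, weights=weights, k=1)[0]


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_leaf_label(class_counts, epsilon, rng):
    """Add Laplace(1/epsilon) noise to each class count, then take the majority."""
    noisy = {label: count + laplace_noise(1.0 / epsilon, rng)
             for label, count in class_counts.items()}
    return max(noisy, key=noisy.get)


rng = random.Random(0)
# Hypothetical candidate splits: per-child [class_0, class_1] counts.
candidate_splits = {
    "age": [[30, 2], [3, 25]],
    "zip": [[17, 13], [16, 14]],
}
scores = {name: split_score(parts) for name, parts in candidate_splits.items()}
chosen = exponential_mechanism(scores, epsilon=1.0, sensitivity=0.5, rng=rng)
label = noisy_leaf_label({"yes": 30, "no": 2}, epsilon=1.0, rng=rng)
```

With a small `epsilon` the split choice and the leaf label become noisier, which is the accuracy cost the abstract mentions. With a very large `epsilon` both functions behave almost like their non-private counterparts.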
Botnet attacks have recently shifted from computers to smartphones because of smartphones' functionality, their ease of exploitation, and the financial motivation of attackers. Most of these attacks target Android because of its popularity and high usage among end users. Every day, more malicious mobile applications (apps) with botnet capabilities are developed to exploit end users' smartphones. This paper therefore presents a new mobile botnet classification based on permissions and Application Programming Interface (API) calls in the smartphone. The classification was developed using static analysis in a controlled lab environment, with the Drebin dataset as the training dataset. To test the proposed classification, 800 apps were chosen randomly from the Google Play Store. As a result, the 16 permissions and 31 API calls most closely related to mobile botnets were extracted using feature selection and then classified and tested using machine learning algorithms. The experimental results show that the Random Forest algorithm achieved the highest detection accuracy, 99.4%, with the lowest false positive rate, 16.1%, compared with the other machine learning algorithms. This new classification can be used as input for mobile botnet detection in future work, especially for financial matters.
Nowadays, cyber attacks affect many institutions and individuals and cause them serious financial loss. Phishing is one of the most common types of cyber attack; it aims to exploit people's weaknesses to obtain confidential information about them. This type of attack threatens almost all internet users and institutions. Reducing the financial loss it causes requires both user awareness and applications that can detect such attacks. In the Phishing Attack Analysis report for the last quarter of 2016, Turkey ranked second among 45 countries, behind China, with an impact rate of approximately 43%. In this study, the characteristics of this type of attack are first explained, and then a machine learning based system is proposed to detect it. In the proposed system, features are extracted using Natural Language Processing (NLP) techniques. The system examines URLs used in phishing attacks before they are opened, using the extracted features. Many tests were run on the resulting system, and the best of the tested algorithms was Random Forest, with a success rate of 89.9%.
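The idea of inspecting a URL before it is opened can be sketched as a small feature extractor. The features below (URL length, dots in the host, digit count, an `@` sign, a raw IP address as host, and word tokens) are common lexical signals chosen for illustration; they are assumptions and not the feature set used in the paper, which the abstract does not list. The function name `url_features` is likewise hypothetical. The resulting dictionary would be the input to a classifier such as a random forest.

```python
import ipaddress
import re
from urllib.parse import urlsplit


def host_is_ip(host):
    """True if the host part is a literal IP address rather than a name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_features(url):
    """Extract simple lexical features from a URL without fetching it."""
    host = urlsplit(url).hostname or ""
    # Alphabetic word tokens, a crude stand-in for NLP-style tokenization.
    tokens = re.findall(r"[a-z]+", url.lower())
    return {
        "url_length": len(url),
        "host_dots": host.count("."),
        "digit_count": sum(ch.isdigit() for ch in url),
        "has_at": "@" in url,
        "host_is_ip": host_is_ip(host),
        "tokens": tokens,
        "longest_token": max((len(t) for t in tokens), default=0),
    }


suspicious = url_features(
    "http://paypal.com.secure-login.example.net/update/account?id=123"
)
raw_ip = url_features("http://192.168.10.5/index.html")
```

In this example the first URL stands out through its many host labels and brand-like tokens, and the second through its IP-address host. Whether such signals indicate phishing is for the trained model to decide.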
Nowadays, a typical household owns multiple digital devices that can be connected to the Internet. Advertising companies want to seamlessly reach the consumers behind the devices rather than the devices themselves. However, a consumer's identity becomes fragmented as they switch from one device to another. A naive approach is to use deterministic features such as user name, telephone number, and email address. However, consumers might refrain from giving away their personal information for privacy and security reasons. The challenge in the ICDM 2015 contest is to develop an accurate probabilistic model for predicting cross-device consumer identity without using deterministic user information. In this paper we present an accurate and scalable cross-device solution using an ensemble of Gradient Boosting Decision Trees (GBDT) and Random Forest. Our final solution ranks 9th on both the public and private leaderboards (LB), with an F0.5 score of 0.855.
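For readers unfamiliar with the metric, the F0.5 score is the F-beta score with beta = 0.5, which weights precision more heavily than recall. The sketch below computes it from counts and from sets of predicted versus true pairs. The function names and the device/cookie pair identifiers are hypothetical, and this is the standard formula only; how the contest aggregated the score across devices is not stated in the abstract and is not modeled here.

```python
def f_beta(true_positives, false_positives, false_negatives, beta=0.5):
    """F-beta score; beta < 1 favors precision, beta > 1 favors recall."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def pair_f_beta(predicted_pairs, true_pairs, beta=0.5):
    """F-beta over predicted vs. true (device, cookie) style pairs."""
    predicted, actual = set(predicted_pairs), set(true_pairs)
    tp = len(predicted & actual)
    return f_beta(tp, len(predicted - actual), len(actual - predicted), beta)


# Hypothetical identifiers, for illustration only.
predicted = [("device_1", "cookie_1"), ("device_2", "cookie_9")]
truth = [("device_1", "cookie_1"), ("device_2", "cookie_2"),
         ("device_3", "cookie_3")]
score = pair_f_beta(predicted, truth)
```

Because precision counts for more than recall at beta = 0.5, a model that predicts fewer but more reliable matches scores higher than one that predicts many uncertain ones.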