Biblio

List
Filter

Found 54 results

Filters: Keyword is Random Forest [Clear All Filters]

2023-09-20

Dixit, Utkarsh, Bhatia, Suman, Bhatia, Pramod. 2022. Comparison of Different Machine Learning Algorithms Based on Intrusion Detection System. 2022 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COM-IT-CON). 1:667—672.

An IDS is a system that helps in detecting any kind of doubtful activity on a computer network. It is capable of identifying suspicious activities at both the levels i.e. locally at the system level and in transit at the network level. Since, the system does not have its own dataset as a result it is inefficient in identifying unknown attacks. In order to overcome this inefficiency, we make use of ML. ML assists in analysing and categorizing attacks on diverse datasets. In this study, the efficacy of eight machine learning algorithms based on KDD CUP99 is assessed. Based on our implementation and analysis, amongst the eight Algorithms considered here, Support Vector Machine (SVM), Random Forest (RF) and Decision Tree (DT) have the highest testing accuracy of which got SVM does have the highest accuracy

2023-03-17

Hasnaeen, Shah Md Nehal, Chrysler, Andrew. 2022. Detection of Malware in UHF RFID User Memory Bank using Random Forest Classifier on Signal Strength Data in the Frequency Domain. 2022 IEEE International Conference on RFID (RFID). :47–52.

A method of detecting UHF RFID tags with SQL in-jection virus code written in its user memory bank is explored. A spectrum analyzer took signal strength readings in the frequency spectrum while an RFID reader was reading the tag. The strength of the signal transmitted by the RFID tag in the UHF range, more specifically within the 902–908 MHz sub-band, was used as data to train a Random Forest model for Malware detection. Feature reduction is accomplished by dividing the observed spectrum into 15 ranges with a bandwidth of 344 kHz each and detecting the number of maxima in each range. The malware-infested tag could be detected more than 80% of the time. The frequency ranges contributing most in this detection method were the low (903.451-903.795 MHz, 902.418-902.762 MHz) and high (907.238-907.582 MHz) bands in the observed spectrum.

ISSN: 2573-7635

2023-02-17

Alimi, Oyeniyi Akeem, Ouahada, Khmaies, Abu-Mahfouz, Adnan M., Rimer, Suvendi, Alimi, Kuburat Oyeranti Adefemi. 2022. Supervised learning based intrusion detection for SCADA systems. 2022 IEEE Nigeria 4th International Conference on Disruptive Technologies for Sustainable Development (NIGERCON). :1–5.

Supervisory control and data acquisition (SCADA) systems play pivotal role in the operation of modern critical infrastructures (CIs). Technological advancements, innovations, economic trends, etc. have continued to improve SCADA systems effectiveness and overall CIs’ throughput. However, the trends have also continued to expose SCADA systems to security menaces. Intrusions and attacks on SCADA systems can cause service disruptions, equipment damage or/and even fatalities. The use of conventional intrusion detection models have shown trends of ineffectiveness due to the complexity and sophistication of modern day SCADA attacks and intrusions. Also, SCADA characteristics and requirement necessitate exceptional security considerations with regards to intrusive events’ mitigations. This paper explores the viability of supervised learning algorithms in detecting intrusions specific to SCADA systems and their communication protocols. Specifically, we examine four supervised learning algorithms: Random Forest, Naïve Bayes, J48 Decision Tree and Sequential Minimal Optimization-Support Vector Machines (SMO-SVM) for evaluating SCADA datasets. Two SCADA datasets were used for evaluating the performances of our approach. To improve the classification performances, feature selection using principal component analysis was used to preprocess the datasets. Using prominent classification metrics, the SVM-SMO presented the best overall results with regards to the two datasets. In summary, results showed that supervised learning algorithms were able to classify intrusions targeted against SCADA systems with satisfactory performances.

ISSN: 2377-2697

2023-02-03

Alkawaz, Mohammed Hazim, Joanne Steven, Stephanie, Mohammad, Omar Farook, Gapar Md Johar, Md. 2022. Identification and Analysis of Phishing Website based on Machine Learning Methods. 2022 IEEE 12th Symposium on Computer Applications & Industrial Electronics (ISCAIE). :246–251.

People are increasingly sharing their details online as internet usage grows. Therefore, fraudsters have access to a massive amount of information and financial activities. The attackers create web pages that seem like reputable sites and transmit the malevolent content to victims to get them to provide subtle information. Prevailing phishing security measures are inadequate for detecting new phishing assaults. To accomplish this aim, objective to meet for this research is to analyses and compare phishing website and legitimate by analyzing the data collected from open-source platforms through a survey. Another objective for this research is to propose a method to detect fake sites using Decision Tree and Random Forest approaches. Microsoft Form has been utilized to carry out the survey with 30 participants. Majority of the participants have poor awareness and phishing attack and does not obverse the features of interface before accessing the search browser. With the data collection, this survey supports the purpose of identifying the best phishing website detection where Decision Tree and Random Forest were trained and tested. In achieving high number of feature importance detection and accuracy rate, the result demonstrates that Random Forest has the best performance in phishing website detection compared to Decision Tree.

2023-01-20

Silva, Cátia, Faria, Pedro, Vale, Zita. 2022. Using Supervised Learning to Assign New Consumers to Demand Response Programs According to the Context. 2022 IEEE International Conference on Environment and Electrical Engineering and 2022 IEEE Industrial and Commercial Power Systems Europe (EEEIC / I&CPS Europe). :1—6.

Active consumers have now been empowered thanks to the smart grid concept. To avoid fossil fuels, the demand side must provide flexibility through Demand Response events. However, selecting the proper participants for an event can be complex due to response uncertainty. The authors design a Contextual Consumer Rate to identify the trustworthy participants according to previous performances. In the present case study, the authors address the problem of new players with no information. In this way, two different methods were compared to predict their rate. Besides, the authors also refer to the consumer privacy testing of the dataset with and without information that could lead to the participant identification. The results found to prove that, for the proposed methodology, private information does not have a high impact to attribute a rate.

2023-01-06

Rasch, Martina, Martino, Antonio, Drobics, Mario, Merenda, Massimo. 2022. Short-Term Time Series Forecasting based on Edge Machine Learning Techniques for IoT devices. 2022 7th International Conference on Smart and Sustainable Technologies (SpliTech). :1—5.

As the effects of climate change are becoming more and more evident, the importance of improved situation awareness is also gaining more attention, both in the context of preventive environmental monitoring and in the context of acute crisis response. One important aspect of situation awareness is the correct and thorough monitoring of air pollutants. The monitoring is threatened by sensor faults, power or network failures, or other hazards leading to missing or incorrect data transmission. For this reason, in this work we propose two complementary approaches for predicting missing sensor data and a combined technique for detecting outliers. The proposed solution can enhance the performance of low-cost sensor systems, closing the gap of missing measurements due to network unavailability, detecting drift and outliers thus paving the way to its use as an alert system for reportable events. The techniques have been deployed and tested also in a low power microcontroller environment, verifying the suitability of such a computing power to perform the inference locally, leading the way to an edge implementation of a virtual sensor digital twin.

2023-01-05

Kumar, Marri Ranjith, Malathi, K.. 2022. An Innovative Method in Improving the accuracy in Intrusion detection by comparing Random Forest over Support Vector Machine. 2022 International Conference on Business Analytics for Technology and Security (ICBATS). :1—6.

Improving the accuracy of intruders in innovative Intrusion detection by comparing Machine Learning classifiers such as Random Forest (RF) with Support Vector Machine (SVM). Two groups of supervised Machine Learning algorithms acquire perfection by looking at the Random Forest calculation (N=20) with the Support Vector Machine calculation (N=20)G power value is 0.8. Random Forest (99.3198%) has the highest accuracy than the SVM (9S.56l5%) and the independent T-test was carried out (=0.507) and shows that it is statistically insignificant (p \textgreater0.05) with a confidence value of 95% by comparing RF and SVM. Conclusion: The comparative examination displays that the Random Forest is more productive than the Support Vector Machine for identifying the intruders are significantly tested.

2022-10-12

Li, Chunzhi. 2021. A Phishing Detection Method Based on Data Mining. 2021 3rd International Conference on Applied Machine Learning (ICAML). :202—205.

Data mining technology is a very important technology in the current era of data explosion. With the informationization of society and the transparency and openness of information, network security issues have become the focus of concern of people all over the world. This paper wants to compare the accuracy of multiple machine learning methods and two deep learning frameworks when using lexical features to detect and classify malicious URLs. As a result, this paper shows that the Random Forest, which is an ensemble learning method for classification, is superior to 8 other machine learning methods in this paper. Furthermore, the Random Forest is even superior to some popular deep neural network models produced by famous frameworks such as TensorFlow and PyTorch when using lexical features to detect and classify malicious URLs.

Kumar, Yogendra, Subba, Basant. 2021. A lightweight machine learning based security framework for detecting phishing attacks. 2021 International Conference on COMmunication Systems & NETworkS (COMSNETS). :184—188.

A successful phishing attack is prelude to various other severe attacks such as login credentials theft, unauthorized access to user’s confidential data, malware and ransomware infestation of victim’s machine etc. This paper proposes a real time lightweight machine learning based security framework for detection of phishing attacks through analysis of Uniform Resource Locators (URLs). The proposed framework initially extracts a set of highly discriminating and uncorrelated features from the URL string corpus. These extracted features are then used to transform the URL strings into their corresponding numeric feature vectors, which are eventually used to train various machine learning based classifier models for identification of malicious phishing URLs. Performance analysis of the proposed security framework on two well known datasets: Kaggle dataset and UNB dataset shows that it is capable of detecting malicious phishing URLs with high precision, while at the same time maintain a very low level of false positive rate. The proposed framework is also shown to outperform other similar security frameworks proposed in the literature.121https://www.kaggle.com/antonyj453/ur1dataset2https://www.unb.ca/cic/datasets/ur1-2016.htm1

2022-07-01

Hashim, Aya, Medani, Razan, Attia, Tahani Abdalla. 2021. Defences Against web Application Attacks and Detecting Phishing Links Using Machine Learning. 2020 International Conference on Computer, Control, Electrical, and Electronics Engineering (ICCCEEE). :1–6.

In recent years web applications that are hacked every day estimated to be 30 000, and in most cases, web developers or website owners do not even have enough knowledge about what is happening on their sites. Web hackers can use many attacks to gain entry or compromise legitimate web applications, they can also deceive people by using phishing sites to collect their sensitive and private information. In response to this, the need is raised to take proper measures to understand the risks and be aware of the vulnerabilities that may affect the website and hence the normal business flow. In the scope of this study, mitigations against the most common web application attacks are set, and the web administrator is provided with ways to detect phishing links which is a social engineering attack, the study also demonstrates the generation of web application logs that simplifies the process of analyzing the actions of abnormal users to show when behavior is out of bounds, out of scope, or against the rules. The methods of mitigation are accomplished by secure coding techniques and the methods for phishing link detection are performed by various machine learning algorithms and deep learning techniques. The developed application has been tested and evaluated against various attack scenarios, the outcomes obtained from the test process showed that the website had successfully mitigated these dangerous web application attacks, and for the detection of phishing links part, a comparison is made between different algorithms to find the best one, and the outcome of the best model gave 98% accuracy.

2022-05-19

Kurihara, Tatsuki, Togawa, Nozomu. 2021. Hardware-Trojan Classification based on the Structure of Trigger Circuits Utilizing Random Forests. 2021 IEEE 27th International Symposium on On-Line Testing and Robust System Design (IOLTS). :1–4.

Recently, with the spread of Internet of Things (IoT) devices, embedded hardware devices have been used in a variety of everyday electrical items. Due to the increased demand for embedded hardware devices, some of the IC design and manufacturing steps have been outsourced to third-party vendors. Since malicious third-party vendors may insert malicious circuits, called hardware Trojans, into their products, developing an effective hardware Trojan detection method is strongly required. In this paper, we propose 25 hardware-Trojan features based on the structure of trigger circuits for machine-learning-based hardware Trojan detection. Combining the proposed features into 11 existing hardware-Trojan features, we totally utilize 36 hardware-Trojan features for classification. Then we classify the nets in an unknown netlist into a set of normal nets and Trojan nets based on the random-forest classifier. The experimental results demonstrate that the average true positive rate (TPR) becomes 63.6% and the average true negative rate (TNR) becomes 100.0%. They improve the average TPR by 14.7 points while keeping the average TNR compared to existing state-of-the-art methods. In particular, the proposed method successfully finds out Trojan nets in several benchmark circuits, which are not found by the existing method.

2022-05-06

Chen, Liiie, Guan, Qihan, Chen, Ning, YiHang, Zhou. 2021. A StackNet Based Model for Fraud Detection. 2021 2nd International Conference on Education, Knowledge and Information Management (ICEKIM). :328–331.

With the rapid development of e-commerce and the increasing popularity of credit cards, online transactions have become increasingly smooth and convenient. However, many online transactions suffer from credit card fraud, resulting in huge losses every year. Many financial organizations and e-commerce companies are devoted to developing advanced fraud detection algorithms. This paper presents an approach to detect fraud transactions using the IEEE-CIS Fraud Detection dataset provided by Kaggle. Our stacked model is based on Gradient Boosting, LightGBM, CatBoost, and Random Forest. Besides, implementing StackNet improves the classification accuracy significantly and provides expandability to the network architecture. Our final model achieved an AUC of 0.9578 for the training set and 0.9325 for the validation set, demonstrating excellent performance in classifying different transaction types.

2022-04-19

Perumal, Seethalakshmi, Sujatha P, Kola. 2021. Stacking Ensemble-based XSS Attack Detection Strategy Using Classification Algorithms. 2021 6th International Conference on Communication and Electronics Systems (ICCES). :897–901.

The accessibility of the internet and mobile platforms has risen dramatically due to digital technology innovations. Web applications have opened up a variety of market possibilities by supplying consumers with a wide variety of digital technologies that benefit from high accessibility and functionality. Around the same time, web application protection continues to be an important challenge on the internet, and security must be taken seriously in order to secure confidential data. The threat is caused by inadequate validation of user input information, software developed without strict adherence to safety standards, vulnerability of reusable software libraries, software weakness, and so on. Through abusing a website's vulnerability, introduers are manipulating the user's information in order to exploit it for their own benefit. Then introduers inject their own malicious code, stealing passwords, manipulating user activities, and infringing on customers' privacy. As a result, information is leaked, applications malfunction, confidential data is accessed, etc. To mitigate the aforementioned issues, stacking ensemble based classifier model for Cross-site scripting (XSS) attack detection is proposed. Furthermore, the stacking ensembles technique is used in combination with different machine learning classification algorithms like k-Means, Random Forest and Decision Tree as base-learners to reliably detect XSS attack. Logistic Regression is used as meta-learner to predict the attack with greater accuracy. The classification algorithms in stacking model explore the problem in their own way and its results are given as input to the meta-learner to make final prediction, thus improving the overall detection accuracy of XSS attack in stacking than the individual models. The simulation findings demonstrate that the proposed model detects XSS attack successfully.

2022-03-23

Gattineni, Pradeep, Dharan, G.R Sakthi. 2021. Intrusion Detection Mechanisms: SVM, random forest, and extreme learning machine (ELM). 2021 Third International Conference on Inventive Research in Computing Applications (ICIRCA). :273–276.

Intrusion detection method cautions and through build recognition rate. Through determine worries forth execution support vector machine (SVM), multilayer perceptron and different procedures have endured utilized trig ongoing work. Such strategies show impediments & persist not effective considering use trig enormous informational indexes, considering example, outline & system information. Interruption recognition outline utilized trig examining colossal traffic information; consequently, a proficient grouping strategy important through beat issue. Aforementioned issue considered trig aforementioned paper. Notable AI methods, specifically, SVM, arbitrary backwoods, & extreme learning machine (ELM) persist applied. These procedures persist notable trig view epithetical their capacity trig characterization. NSL-information revelation & knowledge mining informational collection components. Outcomes demonstrate a certain ELM beats different methodologies.

2022-02-25

Abutaha, Mohammed, Ababneh, Mohammad, Mahmoud, Khaled, Baddar, Sherenaz Al-Haj. 2021. URL Phishing Detection using Machine Learning Techniques based on URLs Lexical Analysis. 2021 12th International Conference on Information and Communication Systems (ICICS). :147—152.

Phishing URLs mainly target individuals and/or organizations through social engineering attacks by exploiting the humans' weaknesses in information security awareness. These URLs lure online users to access fake websites, and harvest their confidential information, such as debit/credit card numbers and other sensitive information. In this work, we introduce a phishing detection technique based on URL lexical analysis and machine learning classifiers. The experiments were carried out on a dataset that originally contained 1056937 labeled URLs (phishing and legitimate). This dataset was processed to generate 22 different features that were reduced further to a smaller set using different features reduction techniques. Random Forest, Gradient Boosting, Neural Network and Support Vector Machine (SVM) classifiers were all evaluated, and results show the superiority of SVMs, which achieved the highest accuracy in detecting the analyzed URLs with a rate of 99.89%. Our approach can be incorporated within add-on/middleware features in Internet browsers for alerting online users whenever they try to access a phishing website using only its URL.

2022-02-07

Khalifa, Marwa Mohammed, Ucan, Osman Nuri, Ali Alheeti, Khattab M.. 2021. New Intrusion Detection System to Protect MANET Networks Employing Machine Learning Techniques. 2021 International Conference of Modern Trends in Information and Communication Technology Industry (MTICTI). :1–6.

The Intrusion Detection System (IDS) is one of the technologies available to protect mobile ad hoc networks. The system monitors the network and detects intrusion from malicious nodes, aiming at passive (eavesdropping) or positive attack to disrupt the network. This paper proposes a new Intrusion detection system using three Machine Learning (ML) techniques. The ML techniques were Random Forest (RF), support vector machines (SVM), and Naïve Bayes(NB) were used to classify nodes in MANET. The data set was generated by the simulator network simulator-2 (NS-2). The routing protocol was used is Dynamic Source Routing (DSR). The type of IDS used is a Network Intrusion Detection System (NIDS). The dataset was pre-processed, then split into two subsets, 67% for training and 33% for testing employing Python Version 3.8.8. Obtaining good results for RF, SVM and NB when applied randomly selected features in the trial and error method from the dataset to improve the performance of the IDS and reduce time spent for training and testing. The system showed promising results, especially with RF, where the accuracy rate reached 100%.

Osman, Mohd Zamri, Abidin, Ahmad Firdaus Zainal, Romli, Rahiwan Nazar, Darmawan, Mohd Faaizie. 2021. Pixel-based Feature for Android Malware Family Classification using Machine Learning Algorithms. 2021 International Conference on Software Engineering Computer Systems and 4th International Conference on Computational Science and Information Management (ICSECS-ICOCSIM). :552–555.

‘Malicious software’ or malware has been a serious threat to the security and privacy of all mobile phone users. Due to the popularity of smartphones, primarily Android, this makes them a very viable target for spreading malware. In the past, many solutions have proved ineffective and have resulted in many false positives. Having the ability to identify and classify malware will help prevent them from spreading and evolving. In this paper, we study the effectiveness of the proposed classification of the malware family using a pixel level as features. This study has implemented well-known machine learning and deep learning classifiers such as K-Nearest Neighbours (k-NN), Support Vector Machine (SVM), Naïve Bayes (NB), Decision Tree, and Random Forest. A binary file of 25 malware families is converted into a fixed grayscale image. The grayscale images were then extracted transforming the size 100x100 into a single format into 100000 columns. During this phase, none of the columns are removed as to remain the patterns in each malware family. The experimental results show that our approach achieved 92% accuracy in Random Forest, 88% in SVM, 81% in Decision Tree, 80% in k-NN and 56% in Naïve Bayes classifier. Overall, the pixel-based feature also reveals a promising technique for identifying the family of malware with great accuracy, especially using the Random Forest classifier.

Acharya, Jatin, Chuadhary, Anshul, Chhabria, Anish, Jangale, Smita. 2021. Detecting Malware, Malicious URLs and Virus Using Machine Learning and Signature Matching. 2021 2nd International Conference for Emerging Technology (INCET). :1–5.

Nowadays most of our data is stored on an electronic device. The risk of that device getting infected by Viruses, Malware, Worms, Trojan, Ransomware, or any unwanted invader has increased a lot these days. This is mainly because of easy access to the internet. Viruses and malware have evolved over time so identification of these files has become difficult. Not only by viruses and malware your device can be attacked by a click on forged URLs. Our proposed solution for this problem uses machine learning techniques and signature matching techniques. The main aim of our solution is to identify the malicious programs/URLs and act upon them. The core idea in identifying the malware is selecting the key features from the Portable Executable file headers using these features we trained a random forest model. This RF model will be used for scanning a file and determining if that file is malicious or not. For identification of the virus, we are using the signature matching technique which is used to match the MD5 hash of the file with the virus signature database containing the MD5 hash of the identified viruses and their families. To distinguish between benign and illegitimate URLs there is a logistic regression model used. The regression model uses a tokenizer for feature extraction from the URL that is to be classified. The tokenizer separates all the domains, sub-domains and separates the URLs on every `/'. Then a TfidfVectorizer (Term Frequency - Inverse Document Frequency) is used to convert the text into a weighted value. These values are used to predict if the URL is safe to visit or not. On the integration of all three modules, the final application will provide full system protection against malicious software.

2021-11-08

Afroz, Sabrina, Ariful Islam, S.M, Nawer Rafa, Samin, Islam, Maheen. 2020. A Two Layer Machine Learning System for Intrusion Detection Based on Random Forest and Support Vector Machine. 2020 IEEE International Women in Engineering (WIE) Conference on Electrical and Computer Engineering (WIECON-ECE). :300–303.

Unauthorized access or intrusion is a massive threatening issue in the modern era. This study focuses on designing a model for an ideal intrusion detection system capable of defending a network by alerting the admins upon detecting any sorts of malicious activities. The study proposes a two layered anomaly-based detection model that uses filter co-relation method for dimensionality reduction along with Random forest and Support Vector Machine as its classifiers. It achieved a very good detection rate against all sorts of attacks including a low rate of false alarms as well. The contribution of this study is that it could be of a major help to the computer scientists designing good intrusion detection systems to keep an industry or organization safe from the cyber threats as it has achieved the desired qualities of a functional IDS model.

2021-09-21

Brzezinski Meyer, Maria Laura, Labit, Yann. 2020. Combining Machine Learning and Behavior Analysis Techniques for Network Security. 2020 International Conference on Information Networking (ICOIN). :580–583.

Network traffic attacks are increasingly common and varied, this is a big problem especially when the target network is centralized. The creation of IDS (Intrusion Detection Systems) capable of detecting various types of attacks is necessary. Machine learning algorithms are widely used in the classification of data, bringing a good result in the area of computer networks. In addition, the analysis of entropy and distance between data sets are also very effective in detecting anomalies. However, each technique has its limitations, so this work aims to study their combination in order to improve their performance and create a new intrusion detection system capable of well detect some of the most common attacks. Reliability indices will be used as metrics to the combination decision and they will be updated in each new dataset according to the decision made earlier.

2021-09-07

Manikumar, D.V.V.S., Maheswari, B Uma. 2020. Blockchain Based DDoS Mitigation Using Machine Learning Techniques. 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA). :794–800.

DDoS attacks are the most commonly performed cyber-attacks with a motive to suspend the target services and making them unavailable to users. A recent attack on Github, explains that the traffic was traced back to ``over a thousand different autonomous systems across millions of unique endpoints''. Generally, there are various types of DDoS attacks and each attack uses a different protocol and attacker uses a botnet to execute such attacks. Hence, it will be very difficult for organizations to deal with these attacks and going for third parties to secure themselves from DDoS attacks. In order to eliminate the third parties. Our proposed system uses machine learning algorithms to identify the incoming packet is malicious or not and use Blockchain technology to store the Blacklist. The key benefit of Blockchain is that blacklisted IP addresses are effectively stored, and usage of such infrastructure provides an advantage of extra security mechanism over existing DDoS mitigation systems. This paper has evaluated three different algorithms, such as the KNN Classifier, the Decision Tree Classifier, Random Forest algorithm to find out the better classifying algorithm. Tree Based Classifier technique used for Feature Selection to boost the computational time. Out of the three algorithms, Random Forest provides an accuracy about 95 % in real-time traffic analysis.

2021-08-31

Ge, Chonghui, Sun, Jian, Sun, Yuxin, Di, Yunlong, Zhu, Yongjin, Xie, Linfeng, Zhang, Yingzhou. 2020. Reversible Database Watermarking Based on Random Forest and Genetic Algorithm. 2020 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC). :239—247.

The advancing information technology is playing more and more important role in data mining of relational database.1 The transfer and sharing of databases cause the copyright-related security threats. Database watermarking technology can effectively solve the problem with copyright protection and traceability, which has been attracting researchers' attention. In this paper, we proposed a novel, robust and reversible database watermarking technique, named histogram shifting watermarking based on random forest and genetic algorithm (RF-GAHCSW). It greatly improves the watermark capacity by means of histogram width reduction and eliminates the impact of the prediction error attack. Meanwhile, random forest algorithm is used to select important attributes for watermark embedding, and genetic algorithm is employed to find the optimal secret key for the database grouping and determine the position of watermark embedding to improve the watermark capacity and reduce data distortion. The experimental results show that the robustness of RF-GAHCSW is greatly improved, compared with the original HSW, and the distortion has little effect on the usability of database.

2021-03-30

Lin, T.-H., Jiang, J.-R.. 2020. Anomaly Detection with Autoencoder and Random Forest. 2020 International Computer Symposium (ICS). :96—99.

This paper proposes AERFAD, an anomaly detection method based on the autoencoder and the random forest, for solving the credit card fraud detection problem. The proposed AERFAD first utilizes the autoencoder to reduce the dimensionality of data and then uses the random forest to classify data as anomalous or normal. Large numbers of credit card transaction data of European cardholders are applied to AEFRAD to detect possible frauds for the sake of performance evaluation. When compared with related methods, AERFAD has relatively excellent performance in terms of the accuracy, true positive rate, true negative rate, and Matthews correlation coefficient.

2021-03-09

Muhammad, A., Asad, M., Javed, A. R.. 2020. Robust Early Stage Botnet Detection using Machine Learning. 2020 International Conference on Cyber Warfare and Security (ICCWS). :1—6.

Among the different types of malware, botnets are rising as the most genuine risk against cybersecurity as they give a stage to criminal operations (e.g., Distributed Denial of Service (DDOS) attacks, malware dispersal, phishing, and click fraud and identity theft). Existing botnet detection techniques work only on specific botnet Command and Control (C&C) protocols and lack in providing early-stage botnet detection. In this paper, we propose an approach for early-stage botnet detection. The proposed approach first selects the optimal features using feature selection techniques. Next, it feeds these features to machine learning classifiers to evaluate the performance of the botnet detection. Experiments reveals that the proposed approach efficiently classifies normal and malicious traffic at an early stage. The proposed approach achieves the accuracy of 99%, True Positive Rate (TPR) of 0.99 %, and False Positive Rate (FPR) of 0.007 % and provide an efficient detection rate in comparison with the existing approach.

2021-02-16

Başkaya, D., Samet, R.. 2020. DDoS Attacks Detection by Using Machine Learning Methods on Online Systems. 2020 5th International Conference on Computer Science and Engineering (UBMK). :52—57.

DDoS attacks impose serious threats to many large or small organizations; therefore DDoS attacks have to be detected as soon as possible. In this study, a methodology to detect DDoS attacks is proposed and implemented on online systems. In the scope of the proposed methodology, Multi Layer Perceptron (MLP), Random Forest (RF), K-Nearest Neighbor (KNN), C-Support Vector Machine (SVC) machine learning methods are used with scaling and feature reduction preprocessing methods and then effects of preprocesses on detection accuracy rates of HTTP (Hypertext Transfer Protocol) flood, TCP SYN (Transport Control Protocol Synchronize) flood, UDP (User Datagram Protocol) flood and ICMP (Internet Control Message Protocol) flood DDoS attacks are analyzed. Obtained results showed that DDoS attacks can be detected with high accuracy of 99.2%.