Biblio

Found 1057 results

Filters: Keyword is machine learning
2022-06-14
Schneider, Madeleine, Aspinall, David, Bastian, Nathaniel D..  2021.  Evaluating Model Robustness to Adversarial Samples in Network Intrusion Detection. 2021 IEEE International Conference on Big Data (Big Data). :3343–3352.
Adversarial machine learning, a technique which seeks to deceive machine learning (ML) models, threatens the utility and reliability of ML systems. This is particularly relevant in critical ML implementations such as those found in Network Intrusion Detection Systems (NIDS). This paper considers the impact of adversarial influence on NIDS and proposes ways to improve ML-based systems. Specifically, we consider five feature robustness metrics to determine which features in a model are most vulnerable, and four defense methods. These methods are tested on six ML models with four adversarial sample generation techniques. Our results show that across different models and adversarial generation techniques, there is limited consistency in vulnerable features or in the effectiveness of defense methods.
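The idea of ranking features by vulnerability can be illustrated with a minimal sketch (the probe below is a hypothetical stand-in, not one of the paper's five metrics): find the smallest single-feature perturbation that flips a toy linear model's decision — features that flip the decision with a tiny perturbation are the most vulnerable.

```python
# Minimal sketch of a per-feature robustness probe (illustrative only;
# the paper's actual metrics and models differ).

def linear_score(x, weights, bias):
    """Score of a toy linear classifier; >= 0 means 'attack'."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def min_flip_perturbation(x, weights, bias, feature, step=0.01, max_delta=10.0):
    """Smallest |delta| on one feature that flips the decision, or None."""
    base = linear_score(x, weights, bias) >= 0
    delta = 0.0
    while abs(delta) <= max_delta:
        for d in (delta, -delta):
            xp = list(x)
            xp[feature] += d
            if (linear_score(xp, weights, bias) >= 0) != base:
                return abs(d)
        delta += step
    return None  # this feature alone cannot flip the decision

x = [1.0, 2.0]
weights = [1.0, 0.1]   # feature 0 dominates the decision
bias = -1.05
scores = {f: min_flip_perturbation(x, weights, bias, f) for f in range(2)}
# Feature 0 flips the decision with a far smaller perturbation than feature 1,
# so it would be flagged as the more vulnerable feature.
```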
Kim, Seongsoo, Chen, Lei, Kim, Jongyeop.  2021.  Intrusion Prediction using Long Short-Term Memory Deep Learning with UNSW-NB15. 2021 IEEE/ACIS 6th International Conference on Big Data, Cloud Computing, and Data Science (BCD). :53–59.
This study shows the effectiveness of an anomaly-based IDS using long short-term memory (LSTM) on the newly developed dataset called UNSW-NB15, with root mean square error and mean absolute error as the evaluation metrics for accuracy. For each attack, 80% and 90% of the samples were used as LSTM inputs to train the model while increasing epoch values. Furthermore, the model predicted attack points when applied to test data, producing possible attack points for each attack at the 3rd time frame relative to the actual attack point. However, in the case of an Exploit attack, where consecutive overlapping attacks happen, there was ambiguity in the interpretation of the numerical values calculated by the LSTM. Throughout this study we presented a methodology for training on data with binary values using LSTM and evaluating with RMSE metrics.
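The two evaluation metrics the study relies on, RMSE and MAE, are straightforward to compute; a minimal sketch (the LSTM itself is omitted, and the label sequences are made up):

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

actual    = [0, 0, 1, 1, 0]   # e.g. per-time-step attack labels
predicted = [0, 0, 1, 0, 0]   # e.g. thresholded model outputs
err_rmse = rmse(actual, predicted)   # one miss out of five time steps
err_mae = mae(actual, predicted)
```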
2022-06-13
Gupta, B. B., Gaurav, Akshat, Peraković, Dragan.  2021.  A Big Data and Deep Learning based Approach for DDoS Detection in Cloud Computing Environment. 2021 IEEE 10th Global Conference on Consumer Electronics (GCCE). :287–290.
Recently, as a result of the COVID-19 pandemic, internet services have seen an upsurge in use. As a result, the use of cloud computing applications, which offer services to end users on a subscription basis, has risen in this situation. However, the availability and efficiency of cloud computing resources are impacted by DDoS attacks, which are designed to disrupt the availability and processing power of cloud computing services. Because there is no effective way to detect or filter DDoS attacks, they remain a dependable weapon for cyber-attackers. Recently, researchers have been experimenting with machine learning (ML) methods in order to create efficient ML-based strategies for detecting DDoS attacks. In this context, we propose a technique for detecting DDoS attacks in a cloud computing environment using big data and deep learning algorithms. The proposed technique utilises big data Spark technology to analyse a large number of incoming packets and a deep learning algorithm to filter malicious packets. The KDDCUP99 dataset was used for training and testing, and an accuracy of 99.73% was achieved.
2022-06-10
Ge, Yurun, Bertozzi, Andrea L..  2021.  Active Learning for the Subgraph Matching Problem. 2021 IEEE International Conference on Big Data (Big Data). :2641–2649.
The subgraph matching problem arises in a number of modern machine learning applications, including segmented images and meshes of 3D objects for pattern recognition, bio-chemical reactions, and security applications. This graph-based problem can have a very large and complex solution space, especially when the world graph has many more nodes and edges than the template. In a real use-case scenario, analysts may need to query additional information about template nodes or world nodes to reduce the problem size and the solution space. Currently, this query process is done by hand, based on the personal experience of analysts. By analogy to the well-known active learning problem in machine learning classification, we present a machine-based active learning problem for the subgraph matching problem in which the machine suggests optimal template target nodes that would be most likely to reduce the solution space when it is otherwise overly large and complex. The humans in the loop can then include additional information about those target nodes. We present case studies on both synthetic and real-world datasets for multichannel subgraph matching.
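One plausible reading of the query-suggestion idea, sketched with a hypothetical heuristic rather than the paper's actual optimization: suggest the template node with the largest candidate set in the world graph, since an analyst's answer about that node prunes the solution space most.

```python
# Illustrative sketch of active query selection for subgraph matching
# (hypothetical heuristic; the paper's selection criterion differs).

def suggest_query(candidates):
    """candidates: dict template_node -> set of possible world-graph nodes.
    Suggest the most ambiguous template node for the analyst to resolve."""
    return max(candidates, key=lambda t: len(candidates[t]))

def apply_answer(candidates, template_node, world_node):
    """The analyst confirms a match; that node's candidate set collapses."""
    pruned = dict(candidates)
    pruned[template_node] = {world_node}
    return pruned

cands = {"t1": {"w1", "w2"}, "t2": {"w3", "w4", "w5", "w6"}, "t3": {"w7"}}
target = suggest_query(cands)            # "t2" has the most ambiguity
cands = apply_answer(cands, target, "w3")
```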
Ramachandran, Gowri Sankar, Deane, Felicity, Malik, Sidra, Dorri, Ali, Jurdak, Raja.  2021.  Towards Assisted Autonomy for Supply Chain Compliance Management. 2021 Third IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA). :321–330.

In an agricultural supply chain, farmers, food processors, transportation agencies, importers, and exporters must comply with different regulations imposed by one or more jurisdictions depending on the nature of their business operations. Supply chain stakeholders conventionally transport their goods, along with the corresponding documentation, via regulators for compliance checks. This is generally followed by a tedious and manual process to ensure the goods meet regulatory requirements. However, supply chain systems are changing through digitization. In digitized supply chains, data is shared with the relevant stakeholders through digital supply chain platforms, including blockchain technology. In such data-driven digital supply chains, the regulators may be able to leverage digital technologies, such as artificial intelligence and machine learning, to automate the compliance verification process. However, a barrier to progress is the risk that information will not be credible, thus reversing the gains that automation could achieve. Automating compliance based on inaccurate data may compromise the safety and credibility of the agricultural supply chain, which discourages regulators and other stakeholders from adopting and relying on automation. Within this article we consider the challenges of digital supply chains as we describe parts of the compliance management process and how it can be automated to improve the operational efficiency of agricultural supply chains. We introduce assisted autonomy as a means to pragmatically automate the compliance verification process by combining the power of digital systems while keeping the human in the loop. We argue that autonomous compliance is possible, but that the need for human-led inspection processes will never be fully replaced by machines; it can, however, be minimised through “assisted autonomy”.

2022-06-09
Yamamoto, Moeka, Kakei, Shohei, Saito, Shoichi.  2021.  FirmPot: A Framework for Intelligent-Interaction Honeypots Using Firmware of IoT Devices. 2021 Ninth International Symposium on Computing and Networking Workshops (CANDARW). :405–411.
IoT honeypots that mimic the behavior of IoT devices for threat analysis are becoming increasingly important. Existing honeypot systems use devices with a specific version of firmware installed to monitor cyber attacks. However, honeypots frequently receive requests targeting devices and firmware that are different from themselves. When honeypots return an error response to such a request, the attack is terminated, and the monitoring fails. To solve this problem, we introduce FirmPot, a framework that automatically generates intelligent-interaction honeypots using firmware. This framework has a firmware emulator optimized for honeypot generation and learns the behavior of embedded applications by using machine learning. The generated honeypots continue to interact with attackers by a mechanism that returns the best from the emulated responses to the attack request instead of an error response. We experimented on embedded web applications of wireless routers based on the open-source OpenWrt. As a result, our framework generated honeypots that mimicked the embedded web applications of eight vendors and ten different CPU architectures. Furthermore, our approach to the interaction improved the session length with attackers compared to existing ones.
Cobb, Adam D., Jalaian, Brian A., Bastian, Nathaniel D., Russell, Stephen.  2021.  Robust Decision-Making in the Internet of Battlefield Things Using Bayesian Neural Networks. 2021 Winter Simulation Conference (WSC). :1–12.
The Internet of Battlefield Things (IoBT) is a dynamically composed network of intelligent sensors and actuators that operate as a command and control, communications, computers, and intelligence complex-system with the aim to enable multi-domain operations. The use of artificial intelligence can help transform the IoBT data into actionable insight to create information and decision advantage on the battlefield. In this work, we focus on how accounting for uncertainty in IoBT systems can result in more robust and safer systems. Human trust in these systems requires the ability to understand and interpret how machines make decisions. Most real-world applications currently use deterministic machine learning techniques that cannot incorporate uncertainty. In this work, we focus on the machine learning task of classifying vehicles from their audio recordings, comparing deterministic convolutional neural networks (CNNs) with Bayesian CNNs to show that correctly estimating the uncertainty can help lead to robust decision-making in IoBT.
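The uncertainty-aware classification the paper describes can be approximated in miniature: average the class probabilities from several stochastic forward passes (as in Monte Carlo approximations of Bayesian networks) and use the predictive entropy of the average as the uncertainty signal. The probabilities below are made up for illustration; no CNN is involved.

```python
import math

def mean_prediction(samples):
    """Average class probabilities over stochastic forward passes."""
    n = len(samples)
    return [sum(s[c] for s in samples) / n for c in range(len(samples[0]))]

def predictive_entropy(probs):
    """Entropy of the averaged prediction; higher = less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Two hypothetical vehicles, three stochastic passes each (2 classes).
confident = [[0.90, 0.10], [0.92, 0.08], [0.88, 0.12]]  # passes agree
uncertain = [[0.90, 0.10], [0.20, 0.80], [0.55, 0.45]]  # passes disagree

h_confident = predictive_entropy(mean_prediction(confident))
h_uncertain = predictive_entropy(mean_prediction(uncertain))
# A robust decision policy could defer to a human when entropy is high.
```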
Karim, Hassan, Rawat, Danda B..  2021.  Evaluating Machine Learning Classifiers for Data Sharing in Internet of Battlefield Things. 2021 IEEE Symposium Series on Computational Intelligence (SSCI). :01–07.
The most widely used method to prevent adversaries from eavesdropping on sensitive sensor, robot, and war fighter communications is mathematically strong cryptographic algorithms. However, prevailing cryptographic protocol mandates are often made without consideration of the resource constraints of devices in the Internet of Battlefield Things (IoBT). In this article, we address the challenges of IoBT sensor data exchange in contested environments. Battlefield IoT (Internet of Things) devices need to exchange data and receive feedback from other devices such as tanks and command and control infrastructure for analysis, tracking, and real-time engagement. Since data in IoBT systems may be massive or sparse, we introduced a machine learning classifier to determine what type of data to transmit under what conditions. We compared Support Vector Machine, Bayes Point Machine, Boosted Decision Trees, Decision Forests, and Decision Jungles on their abilities to recommend the optimal confidentiality-preserving data and transmission path considering dynamic threats. We created a synthesized dataset that simulates platoon maneuvers and IED detection components. We found Decision Jungles to produce the most accurate results while requiring the least resources during training to produce those results. We also introduced the JointField blockchain network for joint and allied force data sharing. With our classifier, strategists and system designers will be able to enable adaptive responses to threats while engaged in real-time field conflict.
Hoarau, Kevin, Tournoux, Pierre Ugo, Razafindralambo, Tahiry.  2021.  Suitability of Graph Representation for BGP Anomaly Detection. 2021 IEEE 46th Conference on Local Computer Networks (LCN). :305–310.
The Border Gateway Protocol (BGP) is in charge of the route exchange at the Internet scale. Anomalies in BGP can have several causes (mis-configuration, outage and attacks). These anomalies are classified into large or small scale anomalies. Machine learning models are used to analyze and detect anomalies from the complex data extracted from BGP behavior. Two types of data representation can be used inside the machine learning models: a graph representation of the network (graph features) or a statistical computation on the data (statistical features). In this paper, we evaluate and compare the accuracy of machine learning models using graph features and statistical features on both large and small scale BGP anomalies. We show that statistical features have better accuracy for large scale anomalies, and graph features increase the detection accuracy by 15% for small scale anomalies and are well suited for BGP small scale anomaly detection.
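A minimal sketch of what "graph features" might look like for BGP data (illustrative only; the paper's feature set is richer): build an AS-level graph from announced AS paths and compute simple structural statistics that can feed a machine learning model.

```python
# Sketch of extracting simple graph features from BGP AS paths
# (illustrative; AS numbers below are made up).

def as_path_edges(as_path):
    """Consecutive AS pairs in an announcement's AS path."""
    return {(a, b) for a, b in zip(as_path, as_path[1:]) if a != b}

def graph_features(paths):
    """Node count, edge count, and average degree of the AS-level graph."""
    edges = set()
    for p in paths:
        edges |= as_path_edges(p)
    nodes = {n for e in edges for n in e}
    avg_degree = 2 * len(edges) / len(nodes) if nodes else 0.0
    return {"nodes": len(nodes), "edges": len(edges), "avg_degree": avg_degree}

paths = [[65001, 65002, 65003], [65001, 65004], [65002, 65003]]
feats = graph_features(paths)
# A sudden jump in these statistics between time windows could be one
# signal of a large-scale routing anomaly.
```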
Atluri, Venkata, Horne, Jeff.  2021.  A Machine Learning based Threat Intelligence Framework for Industrial Control System Network Traffic Indicators of Compromise. SoutheastCon 2021. :1–5.
Cyber-attacks on our Nation's Critical Infrastructure are growing. In this research, a Cyber Threat Intelligence (CTI) framework is proposed, developed, and tested. The results of the research, using 5 different simulated attacks on a dataset from an Industrial Control System (ICS) testbed, are presented with the extracted IOCs. The Bagging Decision Trees model showed the highest performance of testing accuracy (94.24%), precision (0.95), recall (0.93), and F1-score (0.94) among the 9 different machine learning models studied.
Pyatnitsky, Ilya A., Sokolov, Alexander N..  2021.  Determination of the Optimal Ratio of Normal to Anomalous Points in the Problem of Detecting Anomalies in the Work of Industrial Control Systems. 2021 Ural Symposium on Biomedical Engineering, Radioelectronics and Information Technology (USBEREIT). :0478–0480.

Algorithms for unsupervised anomaly detection have proven their effectiveness and flexibility; however, it is first necessary to determine at what ratio a certain class begins to be considered anomalous by the autoencoder. For this reason, we propose to conduct a study of the efficiency of autoencoders depending on the ratio of anomalous to non-anomalous classes. The emergence of high-speed networks in electric power systems creates a tight interaction of cyberinfrastructure with the physical infrastructure and makes the power system susceptible to cyber penetration and attacks. To address this problem, this paper proposes an innovative approach to developing a specification-based intrusion detection framework that leverages available information provided by components in a contemporary power system. An autoencoder is used to encode the causal relations among the available information to create patterns with temporal state transitions, which are used as features in the proposed intrusion detection. This allows the proposed method to detect anomalies and cyber attacks.
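The ratio question the abstract raises can be illustrated with a toy sketch (no real autoencoder is trained; the reconstruction errors below are made up): under an assumed anomaly ratio r, a natural detection threshold is the (1 - r) quantile of the reconstruction errors, so the assumed ratio directly shapes what gets flagged.

```python
# Toy illustration of how an assumed anomaly ratio sets the threshold
# on autoencoder reconstruction errors (errors are made up).

def threshold_for_ratio(errors, anomaly_ratio):
    """The (1 - anomaly_ratio) quantile of the error distribution."""
    s = sorted(errors)
    k = max(0, min(len(s) - 1, int(round((1 - anomaly_ratio) * (len(s) - 1)))))
    return s[k]

def flag_anomalies(errors, anomaly_ratio):
    t = threshold_for_ratio(errors, anomaly_ratio)
    return [e > t for e in errors]

errors = [0.1, 0.2, 0.15, 0.12, 5.0]   # last point reconstructs poorly
flags = flag_anomalies(errors, anomaly_ratio=0.2)
# Choosing the ratio too high or too low would flag too many or too few
# points, which is exactly the trade-off the study investigates.
```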

Ali, Jokha.  2021.  Intrusion Detection Systems Trends to Counteract Growing Cyber-Attacks on Cyber-Physical Systems. 2021 22nd International Arab Conference on Information Technology (ACIT). :1–6.
Cyber-Physical Systems (CPS) suffer from extendable vulnerabilities due to the convergence of the physical world with the cyber world, which makes them victims of a number of sophisticated cyber-attacks. The motives behind such attacks range from criminal enterprises to military, economic, espionage, political, and terrorism-related activities. Many governments are more concerned than ever with securing their critical infrastructure. One of the effective means of detecting threats and securing their infrastructure is the use of Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS). A number of studies have been conducted and proposed to assess the efficacy and effectiveness of IDS through the use of self-learning techniques, especially in the Industrial Control Systems (ICS) era. This paper investigates and analyzes the utilization of IDS systems and their proposed solutions used to enhance the effectiveness of such systems for CPS. The targeted data extraction was from 2011 to 2021 from five selected sources: IEEE, ACM, Springer, Wiley, and ScienceDirect. After applying the inclusion and exclusion criteria, 20 primary studies were selected from a total of 51 studies in the field of threat detection in CPS, ICS, SCADA systems, and the IoT. The outcome revealed the trends in recent research in this area and identified essential techniques to improve detection performance, accuracy, reliability, and robustness. In addition, this study also identified the most vulnerable target layer for cyber-attacks in CPS. Various challenges, opportunities, and solutions were identified. The findings can help scholars in the field learn about how machine learning (ML) methods are used in intrusion detection systems. As a future direction, more research should explore the benefits of ML to safeguard cyber-physical systems.
Philipsen, Simon Grønfeldt, Andersen, Birger, Singh, Bhupjit.  2021.  Threats and Attacks to Modern Vehicles. 2021 IEEE International Conference on Internet of Things and Intelligence Systems (IoTaIS). :22–27.
As modern vehicles are complex IoT devices with intelligence capable of connecting to an external infrastructure and using Vehicle-to-Everything (V2X) communication, there is a need to secure the communication to avoid becoming a target for cyber-attacks. Also, each of the car's subsystems (sensors, communication, and control) could have a vulnerability that leads to accidents or potential deaths. Manufacturers of cars have a huge responsibility to secure the safety of their customers and should not skip important security research, instead making sure to implement important security measures that make their cars less likely to be attacked. This paper covers the relevant attacks and threats to modern vehicles and presents a security analysis with potential countermeasures. We discuss the future of modern and autonomous vehicles and conclude that more countermeasures must be taken to create a safe concept for the future.
2022-06-06
Jobst, Matthias, Liu, Chen, Partzsch, Johannes, Yan, Yexin, Kappel, David, Gonzalez, Hector A., Ji, Yue, Vogginger, Bernhard, Mayr, Christian.  2020.  Event-based Neural Network for ECG Classification with Delta Encoding and Early Stopping. 2020 6th International Conference on Event-Based Control, Communication, and Signal Processing (EBCCSP). :1–4.
We present a scalable architecture based on a trained filter bank for input pre-processing and a recurrent neural network (RNN) for the detection of atrial fibrillation in electrocardiogram (ECG) signals, with the focus on enabling a very efficient hardware implementation as application-specific integrated circuit (ASIC). Our already very efficient base architecture is further improved by replacing the RNN with a delta-encoded gated recurrent unit (GRU) and adding a confidence measure (CM) for terminating the computation as early as possible. With these optimizations, we demonstrate a reduction of the processing load of 58% on an internal dataset while still achieving near state-of-the-art classification results on the Physionet ECG dataset with only 1202 parameters.
Boddy, Aaron, Hurst, William, Mackay, Michael, El Rhalibi, Abdennour.  2019.  A Hybrid Density-Based Outlier Detection Model for Privacy in Electronic Patient Record system. 2019 5th International Conference on Information Management (ICIM). :92–96.
This research concerns the detection of unauthorised access within hospital networks through the real-time analysis of audit logs. Privacy is a primary concern amongst patients due to the rising adoption of Electronic Patient Record (EPR) systems. There is growing evidence to suggest that patients may withhold information from healthcare providers due to lack of Trust in the security of EPRs. Yet, patient record data must be available to healthcare providers at the point of care. Ensuring privacy and confidentiality of that data is challenging. Roles within healthcare organisations are dynamic and relying on access control is not sufficient. Through proactive monitoring of audit logs, unauthorised accesses can be detected and presented to an analyst for review. Advanced data analytics and visualisation techniques can be used to aid the analysis of big data within EPR audit logs to identify and highlight pertinent data points. Employing a human-in-the-loop model ensures that suspicious activity is appropriately investigated and the data analytics is continuously improving. This paper presents a system that employs a Human-in-the-Loop Machine Learning (HILML) algorithm, in addition to a density-based local outlier detection model. The system is able to detect 145 anomalous behaviours in an unlabelled dataset of 1,007,727 audit logs. This equates to 0.014% of the EPR accesses being labelled as anomalous in a specialist Liverpool (UK) hospital.
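A minimal density-based outlier score in the spirit of the paper's model (this is a generic k-nearest-neighbour distance score, not the exact hybrid model used): points far from their nearest neighbours in the audit-log feature space score high and are surfaced to the analyst.

```python
# Generic k-NN density outlier score (illustrative; 1-D features and
# values below are made up, standing in for audit-log features).

def knn_outlier_scores(points, k=2):
    """Mean distance to the k nearest neighbours; higher = more anomalous."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(abs(p - q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

# e.g. number of record accesses per user-session; 100 is far from the
# dense region, so it would be queued for human review.
accesses = [3, 4, 5, 4, 100]
scores = knn_outlier_scores(accesses)
outlier_index = scores.index(max(scores))
```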
Hung, Benjamin W.K., Muramudalige, Shashika R., Jayasumana, Anura P., Klausen, Jytte, Libretti, Rosanne, Moloney, Evan, Renugopalakrishnan, Priyanka.  2019.  Recognizing Radicalization Indicators in Text Documents Using Human-in-the-Loop Information Extraction and NLP Techniques. 2019 IEEE International Symposium on Technologies for Homeland Security (HST). :1–7.
Among the operational shortfalls that hinder law enforcement from achieving greater success in preventing terrorist attacks is the difficulty in dynamically assessing individualized violent extremism risk at scale given the enormous amount of primarily text-based records in disparate databases. In this work, we undertake the critical task of employing natural language processing (NLP) techniques and supervised machine learning models to classify textual data in analyst and investigator notes and reports for radicalization behavioral indicators. This effort to generate structured knowledge will build towards an operational capability to assist analysts in rapidly mining law enforcement and intelligence databases for cues and risk indicators. In the near-term, this effort also enables more rapid coding of biographical radicalization profiles to augment a research database of violent extremists and their exhibited behavioral indicators.
2022-05-24
Pellenz, Marcelo E., Lachowski, Rosana, Jamhour, Edgard, Brante, Glauber, Moritz, Guilherme Luiz, Souza, Richard Demo.  2021.  In-Network Data Aggregation for Information-Centric WSNs using Unsupervised Machine Learning Techniques. 2021 IEEE Symposium on Computers and Communications (ISCC). :1–7.
IoT applications are changing our daily lives. These innovative applications are supported by new communication technologies and protocols. Particularly, the information-centric network (ICN) paradigm is well suited for many IoT application scenarios that involve large-scale wireless sensor networks (WSNs). Even though the ICN approach can significantly reduce the network traffic by optimizing the process of information recovery from network nodes, it is also possible to apply data aggregation strategies. This paper proposes an unsupervised machine learning-based data aggregation strategy for multi-hop information-centric WSNs. The results show that the proposed algorithm can significantly reduce the ICN data traffic while having reduced information degradation.
2022-05-20
Kjamilji, Artrim, Levi, Albert, Savas, Erkay, Güney, Osman Berke.  2021.  Secure Matrix Operations for Machine Learning Classifications Over Encrypted Data in Post Quantum Industrial IoT. 2021 International Symposium on Networks, Computers and Communications (ISNCC). :1–8.
We tackle the problem where a server owns a trained Machine Learning (ML) model and a client/user has an unclassified query that he wishes to classify in a secure and private fashion using the server’s model. During the process the server learns nothing, while the user learns only his final classification and nothing else. Since several ML classification algorithms, such as deep neural networks, support vector machines-SVM (and hyperplane decisions in general), Logistic Regression, Naïve Bayes, etc., can be expressed in terms of matrix operations, initially we propose novel secure matrix operations as our building blocks. On top of them we build our secure and private ML classification algorithms under strict security and privacy requirements. As our underlying cryptographic primitives are shown to be resilient to quantum computer attacks, our algorithms are also suitable for the post-quantum world. Our theoretical analysis and extensive experimental evaluations show that our secure matrix operations, and hence our secure ML algorithms built on top of them as well, outperform the state-of-the-art schemes in terms of computation and communication costs. This makes our algorithms suitable for devices with limited resources that are often found in Industrial IoT (Internet of Things).
2022-05-19
Zhang, Xiaoyu, Fujiwara, Takanori, Chandrasegaran, Senthil, Brundage, Michael P., Sexton, Thurston, Dima, Alden, Ma, Kwan-Liu.  2021.  A Visual Analytics Approach for the Diagnosis of Heterogeneous and Multidimensional Machine Maintenance Data. 2021 IEEE 14th Pacific Visualization Symposium (PacificVis). :196–205.
Analysis of large, high-dimensional, and heterogeneous datasets is challenging as no one technique is suitable for visualizing and clustering such data in order to make sense of the underlying information. For instance, heterogeneous logs detailing machine repair and maintenance in an organization often need to be analyzed to diagnose errors and identify abnormal patterns, formalize root-cause analyses, and plan preventive maintenance. Such real-world datasets are also beset by issues such as inconsistent and/or missing entries. To conduct an effective diagnosis, it is important to extract and understand patterns from the data with support from analytic algorithms (e.g., finding that certain kinds of machine complaints occur more in the summer) while involving the human-in-the-loop. To address these challenges, we adopt existing techniques for dimensionality reduction (DR) and clustering of numerical, categorical, and text data dimensions, and introduce a visual analytics approach that uses multiple coordinated views to connect DR + clustering results across each kind of the data dimension stated. To help analysts label the clusters, each clustering view is supplemented with techniques and visualizations that contrast a cluster of interest with the rest of the dataset. Our approach assists analysts to make sense of machine maintenance logs and their errors. Then the gained insights help them carry out preventive maintenance. We illustrate and evaluate our approach through use cases and expert studies respectively, and discuss generalization of the approach to other heterogeneous data.
Wang, Yuze, Liu, Peng, Han, Xiaoxia, Jiang, Yingtao.  2021.  Hardware Trojan Detection Method for Inspecting Integrated Circuits Based on Machine Learning. 2021 22nd International Symposium on Quality Electronic Design (ISQED). :432–436.
Nowadays malicious vendors can easily insert hardware Trojans into integrated circuit chips as the entire integrated chip supply chain involves numerous design houses and manufacturers on a global scale. It is thereby becoming a necessity to expose any possible hardware Trojans, if they ever exist in a chip. A typical Trojan circuit is made of a trigger and a payload that are interconnected with a trigger net. As the trigger net can be viewed as the signature of a hardware Trojan, in this paper, we propose a gate-level hardware Trojan detection method and model that can be applied to screen the entire chip for trigger nets. Specifically, we extract the trigger-net features for each net from known netlists and use the machine learning method to train multiple detection models according to the trigger modes. The detection models are used to identify suspicious trigger nets from the netlist of the integrated circuit under detection, and score each net in terms of suspiciousness value. By flagging the top 2% suspicious nets with the highest suspiciousness values, we are able to detect the majority of hardware Trojans, with an average accuracy rate of 96%.
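The final flagging step described above is easy to sketch (the net names and suspiciousness values below are made up): rank all nets by their model-assigned suspiciousness and flag the top 2%.

```python
# Sketch of the top-2% flagging step (scores are made up; the models
# that produce the suspiciousness values are omitted).

def flag_top_fraction(scores, fraction=0.02):
    """scores: dict net_name -> suspiciousness. Returns flagged net names."""
    n_flag = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_flag])

scores = {f"net{i}": 0.005 * i for i in range(100)}  # 100 ordinary nets
scores["trigger_net"] = 0.99                          # one suspicious net
flagged = flag_top_fraction(scores)
# Only the top 2% of 101 nets (two nets) are handed to an engineer for
# manual inspection, and the planted trigger net is among them.
```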
Sharma, Anurag, Mohanty, Suman, Islam, Md. Ruhul.  2021.  An Experimental Analysis on Malware Detection in Executable Files using Machine Learning. 2021 8th International Conference on Smart Computing and Communications (ICSCC). :178–182.
In recent times, owing to the advancement of technology, malware and its variants have continued to evolve and become more diverse. Malware, or malicious software, includes viruses, Trojan horses, adware, spyware, etc. Such software leads to the extrusion of data (spyware), a continuous flow of ads (adware), modification or damage of system files (viruses), or access to personal information (Trojan horses). Some of the major factors driving the growth of these attacks are poorly secured devices and the easy availability of tools on the Internet with which anyone can attack any system. The attackers or the developers of malware usually lean towards blending malware into executable files, which makes it hard to detect the presence of malware in those files. In this paper we present an experimental study of various machine learning algorithms for detecting the presence of malware in executable files. After testing Naïve Bayes, KNN and SVM, we found that SVM was the best-suited algorithm, with an accuracy of 94%. We then created a web application where the user can upload an executable file and test whether it is a malware file or a benign file.
Kurihara, Tatsuki, Togawa, Nozomu.  2021.  Hardware-Trojan Classification based on the Structure of Trigger Circuits Utilizing Random Forests. 2021 IEEE 27th International Symposium on On-Line Testing and Robust System Design (IOLTS). :1–4.
Recently, with the spread of Internet of Things (IoT) devices, embedded hardware devices have been used in a variety of everyday electrical items. Due to the increased demand for embedded hardware devices, some of the IC design and manufacturing steps have been outsourced to third-party vendors. Since malicious third-party vendors may insert malicious circuits, called hardware Trojans, into their products, developing an effective hardware Trojan detection method is strongly required. In this paper, we propose 25 hardware-Trojan features based on the structure of trigger circuits for machine-learning-based hardware Trojan detection. Combining the proposed features with 11 existing hardware-Trojan features, we utilize 36 hardware-Trojan features in total for classification. Then we classify the nets in an unknown netlist into a set of normal nets and Trojan nets based on the random-forest classifier. The experimental results demonstrate that the average true positive rate (TPR) becomes 63.6% and the average true negative rate (TNR) becomes 100.0%. They improve the average TPR by 14.7 points while keeping the average TNR compared to existing state-of-the-art methods. In particular, the proposed method successfully finds Trojan nets in several benchmark circuits which are not found by the existing method.
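The TPR/TNR evaluation reported above can be computed directly from predicted and true net labels; a minimal sketch (labels below are made up; 1 marks a Trojan net, 0 a normal net):

```python
# TPR = Trojan nets correctly found; TNR = normal nets correctly cleared.

def tpr_tnr(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

y_true = [1, 1, 1, 0, 0, 0, 0, 0]   # ground-truth net labels
y_pred = [1, 1, 0, 0, 0, 0, 0, 0]   # classifier output
tpr, tnr = tpr_tnr(y_true, y_pred)  # two of three Trojan nets found,
                                    # no normal net falsely flagged
```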
Ndichu, Samuel, Ban, Tao, Takahashi, Takeshi, Inoue, Daisuke.  2021.  A Machine Learning Approach to Detection of Critical Alerts from Imbalanced Multi-Appliance Threat Alert Logs. 2021 IEEE International Conference on Big Data (Big Data). :2119–2127.
The extraordinary number of alerts generated by network intrusion detection systems (NIDS) can desensitize security analysts tasked with incident response. Security information and event management systems (SIEMs) perform some rudimentary automation but cannot replicate the decision-making process of a skilled analyst. Machine learning and artificial intelligence (AI) can detect patterns in data with appropriate training. In practice, the majority of the alert data comprises false alerts, and true alerts form only a small proportion. Consequently, a naive engine that classifies all security alerts into the majority class can yield a superficially high accuracy close to 100%. Without any correction for the class imbalance, the false alerts will dominate algorithmic predictions, resulting in poor generalization performance. We propose a machine-learning approach to address the class imbalance problem in multi-appliance security alert data and automate the security alert analysis process performed in security operations centers (SOCs). We first used the neighborhood cleaning rule (NCR) to identify and remove ambiguous, noisy, and redundant false alerts. Then, we applied the support vector machine synthetic minority oversampling technique (SVMSMOTE) to generate synthetic training true alerts. Finally, we fit and evaluated the decision tree and random forest classifiers. In the experiments, using alert data from eight security appliances, we demonstrated that the proposed method can significantly reduce the need for manual auditing, decreasing the number of uninspected alerts and achieving a performance of 99.524% in recall.
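The oversampling step can be sketched with plain SMOTE-style interpolation between minority samples (the paper uses the SVM-guided variant, SVMSMOTE; the alert feature vectors below are made up): each synthetic true alert lies on the line segment between two real true alerts.

```python
import random

# Plain SMOTE-style minority oversampling (illustrative sketch).

def smote_like(minority, n_new, seed=0):
    """Generate n_new synthetic samples by interpolating minority pairs."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)          # pick two minority samples
        lam = rng.random()                      # interpolation factor in [0, 1)
        synthetic.append([ai + lam * (bi - ai) for ai, bi in zip(a, b)])
    return synthetic

true_alerts = [[1.0, 0.2], [0.9, 0.3], [1.1, 0.25]]   # rare 'true alert' class
new_alerts = smote_like(true_alerts, n_new=5)
# The augmented minority class gives the downstream decision tree or
# random forest a balanced view of true vs. false alerts.
```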
Aljubory, Nawaf, Khammas, Ban Mohammed.  2021.  Hybrid Evolutionary Approach in Feature Vector for Ransomware Detection. 2021 International Conference on Intelligent Technology, System and Service for Internet of Everything (ITSS-IoE). :1–6.

Ransomware is one of the most serious threats and constitutes a significant challenge in the cybersecurity field. Cybercriminals use this attack to encrypt the victim's files or infect the victim's devices, demanding a ransom in exchange for restoring access to these files and devices. The escalating threat of ransomware to thousands of individuals and companies creates an urgent need for a system capable of proactively detecting and preventing it. In this research, a new approach is proposed to detect and classify ransomware based on three machine learning algorithms (Random Forest, Support Vector Machines, and Naïve Bayes). The feature set was extracted directly from raw bytes using static analysis of samples to improve detection speed. To offer the best detection accuracy, CF-NCF (Class Frequency - Non-Class Frequency) has been utilized to generate feature vectors. The proposed approach can differentiate between ransomware and goodware files with a detection accuracy of up to 98.33 percent.
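One plausible reading of the CF-NCF weighting, sketched under the assumption that it contrasts a byte-feature's frequency inside the target class with its frequency outside it (the paper's exact formula may differ; the byte samples below are made up):

```python
# Hedged sketch of a CF-NCF-style weight: frequency of a byte n-gram in
# ransomware samples minus its frequency in goodware samples.

def ngrams(data, n=2):
    """All byte n-grams occurring in a raw byte sample."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}

def cf_ncf(feature, class_samples, other_samples, n=2):
    cf = sum(feature in ngrams(s, n) for s in class_samples) / len(class_samples)
    ncf = sum(feature in ngrams(s, n) for s in other_samples) / len(other_samples)
    return cf - ncf

ransomware = [b"\x90\x90\xeb\xfe", b"\x90\x90\x41\x42"]  # made-up bytes
goodware = [b"\x55\x48\x89\xe5", b"\x55\x48\x83\xec"]    # made-up bytes

w_ransom = cf_ncf(b"\x90\x90", ransomware, goodware)  # class-specific: high
w_good = cf_ncf(b"\x55\x48", ransomware, goodware)    # anti-correlated: low
# Features with weights near +1 or -1 are the most discriminative and
# would dominate the feature vectors fed to the three classifiers.
```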

Perrone, Paola, Flammini, Francesco, Setola, Roberto.  2021.  Machine Learning for Threat Recognition in Critical Cyber-Physical Systems. 2021 IEEE International Conference on Cyber Security and Resilience (CSR). :298–303.

Cybersecurity has become an emerging challenge for business information management and critical infrastructure protection in recent years. Artificial Intelligence (AI) has been widely used in different fields, but it is still relatively new in the area of Cyber-Physical Systems (CPS) security. In this paper, we provide an approach based on Machine Learning (ML) for intelligent threat recognition to enable run-time risk assessment for superior situation awareness in CPS security monitoring. With the aim of classifying malicious activity, several machine learning methods, such as k-nearest neighbours (kNN), Naïve Bayes (NB), Support Vector Machine (SVM), Decision Tree (DT) and Random Forest (RF), have been applied and compared using two different publicly available real-world testbeds. The results show that RF allowed for the best classification performance. When used in reference industrial applications, the approach allows security control room operators to be notified of threats only when classification confidence is above a threshold, hence reducing the stress of security managers and effectively supporting their decisions.