Bibliography
Self-adaptive systems commonly operate in heterogeneous contexts and need to consider multiple quality attributes. Human stakeholders often express their quality preferences by defining utility functions, which self-adaptive systems use to automatically generate adaptation plans. However, the adaptation space of realistic systems is large, and it is unclear how utility functions, together with structural, behavioral, and quality constraints, shape the generated adaptation behavior. Moreover, human stakeholders are often not aware of the underlying tradeoffs between quality attributes. To address this issue, we present an approach that uses machine learning techniques (dimensionality reduction, clustering, and decision tree learning) to explain the reasoning behind automated planning. Our approach focuses on the tradeoffs between quality attributes and on how the choice of weights in utility functions results in different plans being generated. We help humans understand quality attribute tradeoffs, identify key decisions in adaptation behavior, and explore how differences in utility functions lead to different adaptation alternatives. We use two systems to demonstrate the approach's applicability and consider its potential application to 24 exemplar self-adaptive systems. Moreover, we assess the tradeoff between the information reduction and the amount of explained variance retained by the results of our approach.
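A minimal sketch of the kind of pipeline this abstract describes, not the authors' tooling: sample utility weights, reduce the resulting plan quality vectors with PCA, cluster them, and fit a decision tree that explains which weights lead to which cluster. All data, feature names, and dimensions below are synthetic assumptions for illustration (scikit-learn).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Hypothetical data: each row is a plan's quality-attribute vector
# produced by a planner under sampled utility weights.
weights = rng.dirichlet(np.ones(3), size=200)            # utility weights
plans = weights @ rng.normal(size=(3, 6)) + 0.1 * rng.normal(size=(200, 6))

reduced = PCA(n_components=2).fit_transform(plans)       # dimensionality reduction
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(reduced)

# The tree maps utility weights to plan clusters, exposing the weight
# thresholds at which the planner switches to a different kind of plan.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(weights, labels)
print(export_text(tree, feature_names=["w_latency", "w_cost", "w_reliability"]))
```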
With the advent of networking technologies and the increase in network attacks, intrusion detection systems (IDSs) are clearly needed to stop attacks and malicious activities. Various frameworks and techniques have been developed for intrusion detection, yet new frameworks are still needed to cope with the enormous scale of today's data and the evolving nature of attacks. Current IDSs also struggle to sustain the throughput required by high-speed networks. In this paper we address the high computational overhead of anomaly-based IDSs and propose a solution that uses discretization as a data preprocessing step, which can drastically reduce the computational overhead. We propose a method that provides near-real-time detection of attacks using only basic flow-level features that can easily be extracted from network packets.
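A minimal sketch of the discretization idea, under the assumption of simple flow-level features (the feature set and data below are invented for illustration): binning continuous features into a few ordinal symbols lets a detector work over a compact symbol space instead of raw values.

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

rng = np.random.default_rng(0)
# Hypothetical flows: duration, bytes, packets, mean inter-arrival time
flows = rng.lognormal(mean=2.0, sigma=1.0, size=(1000, 4))

disc = KBinsDiscretizer(n_bins=8, encode="ordinal", strategy="quantile")
symbols = disc.fit_transform(flows)    # each feature mapped to a bin id 0..7

# Downstream anomaly scoring now operates on an 8^4 symbol space rather
# than continuous values, which is far cheaper to model and to look up.
print(symbols[:3])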
The detection of malicious HTTP(S) requests is a pressing concern in cyber security, in particular given the proliferation of HTTP-based (micro-)service architectures. In addition to rule-based systems for known attacks, anomaly detection has been shown to be a promising approach for unknown (zero-day) attacks. This article extends existing work by integrating outlier explanations for individual requests into an end-to-end pipeline. These end-to-end explanations reflect the internal workings of the pipeline. Empirically, we show that the found explanations coincide with manually labelled explanations for identified outliers, allowing security professionals to quickly identify and understand malicious requests.
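A minimal sketch standing in for the article's pipeline, not its actual method: detect outlier requests with an Isolation Forest and attach a crude per-feature explanation via robust z-scores against the inlier population. The request features and values are hypothetical.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
feature_names = ["url_length", "num_params", "body_entropy", "special_chars"]
normal = rng.normal(loc=[40, 3, 4.0, 5], scale=[10, 2, 0.5, 3], size=(500, 4))
attack = np.array([[400, 1, 7.5, 60]])           # e.g., an injection-like request
X = np.vstack([normal, attack])

det = IsolationForest(random_state=0).fit(X)
scores = det.decision_function(X)                # lower = more anomalous
outlier = int(np.argmin(scores))

# Crude explanation: how far each feature deviates from the inlier median,
# in units of median absolute deviation (MAD).
med = np.median(normal, axis=0)
mad = np.median(np.abs(normal - med), axis=0) + 1e-9
contrib = np.abs(X[outlier] - med) / mad
for name, c in sorted(zip(feature_names, contrib), key=lambda t: -t[1]):
    print(f"{name}: deviation {c:.1f}x MAD")
```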
Among the different types of malware, botnets are emerging as the most serious threat to cybersecurity, as they provide a platform for criminal operations (e.g., Distributed Denial of Service (DDoS) attacks, malware distribution, phishing, click fraud, and identity theft). Existing botnet detection techniques work only with specific botnet Command and Control (C&C) protocols and fail to provide early-stage detection. In this paper, we propose an approach for early-stage botnet detection. The proposed approach first selects the optimal features using feature selection techniques. Next, it feeds these features to machine learning classifiers to evaluate botnet detection performance. Experiments reveal that the proposed approach efficiently classifies normal and malicious traffic at an early stage. It achieves an accuracy of 99%, a True Positive Rate (TPR) of 0.99, and a False Positive Rate (FPR) of 0.007, and provides an efficient detection rate in comparison with existing approaches.
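A minimal sketch of the pipeline shape the abstract describes (feature selection, then classification, then TPR/FPR evaluation), on synthetic stand-in data rather than real traffic:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

# Synthetic stand-in: 10% of flows are botnet traffic.
X, y = make_classification(n_samples=2000, n_features=30, n_informative=8,
                           weights=[0.9, 0.1], random_state=0)

# Step 1: select the most informative features.
X_sel = SelectKBest(mutual_info_classif, k=8).fit_transform(X, y)

# Step 2: train a classifier and evaluate TPR/FPR.
Xtr, Xte, ytr, yte = train_test_split(X_sel, y, stratify=y, random_state=0)
pred = RandomForestClassifier(random_state=0).fit(Xtr, ytr).predict(Xte)

tn, fp, fn, tp = confusion_matrix(yte, pred).ravel()
print(f"TPR={tp/(tp+fn):.3f}  FPR={fp/(fp+tn):.3f}")
```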
The main idea of the proposed work is to apply a Partitional Clustering Algorithm (PCA) on the Hadoop Distributed File System to secure big data using a perturbation technique. There are numerous clustering methods available for categorizing information in big data. PCA discovers clusters based on an initial partition of the data. In this approach, it is possible to safeguard the data while its perturbed form still permits the required computation and communication. The performance was analyzed on a health care database in terms of parameters such as precision, accuracy, and F-score. The results demonstrate that this method reduces the complexity of preserving privacy and achieves better accuracy than existing techniques.
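A minimal sketch of the perturbation idea (synthetic data, simple additive noise as the assumed perturbation, local rather than Hadoop-scale): perturb the data for privacy, then check how much of the partitional clustering structure survives.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

X, _ = make_blobs(n_samples=600, centers=4, random_state=0)
rng = np.random.default_rng(0)
X_perturbed = X + rng.normal(scale=0.5, size=X.shape)   # privacy perturbation

orig = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
pert = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X_perturbed)

# High agreement means the perturbed (privacy-preserving) data still
# supports nearly the same partitional clustering as the original.
print("cluster agreement (ARI):", adjusted_rand_score(orig, pert))
```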
Deep learning is an area of machine learning research that can be used to process large amounts of information in an intelligent way using the functionality of computational intelligence. A deep learning system is a fully trainable system, from the raw input to the final output of recognized objects. Feature selection is an important aspect of deep learning: it can be applied for dimensionality reduction or attribute reduction, making the information more explicit and usable. Deep learning can build learning models that abstract unknown information by selecting a subset of relevant features. This property makes it useful for analyzing highly complex information, such as the intrusive data or information flowing within a web system or network that must be analyzed to detect anomalies. Our approach harnesses this ability of deep learning to build a smart intrusion detection system.
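A minimal sketch of a fully trainable network from raw features to attack/normal labels, standing in for whatever deep architecture the paper actually uses; the data is synthetic and the layer sizes are arbitrary assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=3000, n_features=40, n_informative=12,
                           random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# The stacked hidden layers learn progressively more abstract features,
# which is the attribute-reduction role the abstract attributes to
# deep learning.
model = make_pipeline(StandardScaler(),
                      MLPClassifier(hidden_layer_sizes=(64, 32, 16),
                                    max_iter=500, random_state=0))
print("accuracy:", model.fit(Xtr, ytr).score(Xte, yte))
```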
Nowadays, for economic reasons, most semiconductor companies prefer to outsource the manufacturing of their designs to third-party fabrication foundries, the so-called fabs. Untrustworthy fabs can extract circuit blocks, so-called intellectual properties (IPs), from the layouts and then pirate them. Such fabs are also suspected of the hardware Trojan (HT) threat, in which malicious circuits are added to the layouts for sabotage objectives. HTs increase the power consumption of infected circuits. However, due to process variations, the power of HTs comprising only a few gates in million-gate circuits is not detectable by power consumption analysis (PCA). Thus, such circuits should be considered as a collection of small sub-circuits, and PCA must be performed individually for each of them. In this article, we introduce an approach that facilitates PCA-based HT detection methods. To this end, we propose a new logic locking method and algorithm. Logic locking methods and algorithms are usually employed against IP piracy; they modify circuits such that they do not work correctly unless the correct key is applied. Our experiments at the gate level and post-synthesis show that the proposed locking method and algorithm increase the ratio of HT activity, and consequently of HT power, to circuit power.
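A minimal numeric sketch of why partitioning matters for PCA-based HT detection; the gate counts and the 5% process-variation noise floor are illustrative figures, not values from the article. A few-gate HT is invisible at whole-chip scale but becomes comparable to the noise floor inside a small sub-circuit, and locking that raises HT activity pushes it further up.

```python
process_variation = 0.05                 # assumed +/-5% measurement uncertainty

circuit_gates, ht_gates = 1_000_000, 10
whole_chip_delta = ht_gates / circuit_gates           # HT share of chip power
print(f"whole chip: HT adds {whole_chip_delta:.4%} "
      f"(noise floor {process_variation:.0%}) -> undetectable")

sub_circuit_gates = 500                  # analyze the chip as small partitions
partition_delta = ht_gates / sub_circuit_gates        # HT share of partition power
print(f"sub-circuit: HT adds {partition_delta:.2%} "
      f"-> near the noise floor; locking that raises HT activity "
      f"lifts it above it")
```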
Sparse and low-rank matrix decomposition is a recently developed method for estimating the different components of hyperspectral data. The low-rank component preserves the global structure of the data, while the sparse component selects discriminative information by preserving details. To take advantage of both, we present a novel decision fusion based on joint low-rank and sparse components (DFJLRS) method for hyperspectral imagery in this paper. First, we analyze the effects of the different components on classification results. The method then adopts a decision fusion strategy that combines an SVM classifier with the information provided by the joint sparse and low-rank components. By combining their advantages, the proposed method is both representative and discriminative. The proposed algorithm is evaluated on several hyperspectral images and compared with traditional counterparts.
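A minimal sketch of a generic low-rank plus sparse decomposition (principal component pursuit via an inexact augmented Lagrangian), the kind of split DFJLRS builds its decision fusion on; this is a textbook variant on synthetic data, not the authors' formulation.

```python
import numpy as np

def low_rank_sparse(M, max_iter=300, tol=1e-7):
    """Decompose M into a low-rank L plus a sparse S (M ~ L + S)."""
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n))
    mu = (m * n) / (4.0 * np.abs(M).sum())
    S = np.zeros_like(M)
    Y = np.zeros_like(M)
    for _ in range(max_iter):
        # Singular value thresholding -> low-rank update.
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # Soft thresholding -> sparse update.
        T = M - L + Y / mu
        S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)
        Y += mu * (M - L - S)
        if np.linalg.norm(M - L - S) / np.linalg.norm(M) < tol:
            break
    return L, S

rng = np.random.default_rng(0)
ground = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 80))    # rank-5 signal
spikes = (rng.random((100, 80)) < 0.05) * rng.normal(scale=5, size=(100, 80))
L, S = low_rank_sparse(ground + spikes)
print("recovered rank of L:", np.linalg.matrix_rank(L))
```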
Designing a machine-learning-based network intrusion detection system (IDS) with high-dimensional features can lead to prolonged classification processes, whereas low-dimensional features can shorten them. Moreover, the classification of network traffic with imbalanced class distributions significantly limits the performance attainable by most well-known classifiers. In the presence of imbalanced data, the usual metrics may fail to provide adequate information about the performance of a classifier. This study first uses Principal Component Analysis (PCA) as a feature dimensionality reduction approach. The resulting low-dimensional features are then used to build various classifiers, such as Random Forest (RF), Bayesian Network, Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis (QDA), for designing an IDS. The experimental findings with low-dimensional features in binary and multi-class classification show better performance in terms of Detection Rate (DR), F-Measure, False Alarm Rate (FAR), and Accuracy. Furthermore, we apply a multi-class combined performance metric, CombinedMc, that incorporates FAR, DR, Accuracy, and class distribution parameters. In addition, we developed a uniform-distribution-based balancing approach to handle the imbalanced distribution of minority-class instances in the CICIDS2017 network intrusion dataset. Using PCA, we were able to reduce the CICIDS2017 dataset's feature dimensions from 81 to 10 while maintaining a high accuracy of 99.6% in both multi-class and binary classification.
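A minimal sketch of the reported pipeline shape (PCA from 81 features down to 10, then a classifier, evaluated with DR and FAR) on synthetic stand-in data rather than CICIDS2017:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

# Synthetic, imbalanced stand-in for the 81-feature traffic data.
X, y = make_classification(n_samples=5000, n_features=81, n_informative=15,
                           weights=[0.8, 0.2], random_state=0)
X10 = PCA(n_components=10).fit_transform(X)       # 81 -> 10 dimensions

Xtr, Xte, ytr, yte = train_test_split(X10, y, stratify=y, random_state=0)
pred = RandomForestClassifier(random_state=0).fit(Xtr, ytr).predict(Xte)

tn, fp, fn, tp = confusion_matrix(yte, pred).ravel()
print(f"DR (recall) = {tp/(tp+fn):.3f}, FAR = {fp/(fp+tn):.3f}")
```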
False data injection is an ongoing concern in power system state estimation. In this work, a neural network is trained to detect the existence of false data in measurements. The proposed approach can make use of historical data, if available, by including them in the training sets of the proposed neural network model. The inputs of the perceptron model in this work are the residual elements from state estimation, which are highly correlated; their dimension can therefore be reduced while preserving the most informative features. To this end, principal component analysis, a data preprocessing technique, is used. This technique is especially efficient for highly correlated data sets, which is the case for power system measurements. The results of the different perceptron models proposed for detection are compared with a simple perceptron that produces results identical to the outlier detection scheme. To generate the training sets, state estimation was run with different false data injected into different measurements of the IEEE 13-bus test system, and the residuals were saved as training inputs. The testing results show that the trained network performs well in detecting false data in measurements.
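A minimal sketch of the detection step, with synthetic correlated "residuals" standing in for real state-estimation output: PCA compresses the correlated residual vector, and a perceptron flags measurements containing injected bias. The dimensions and bias magnitude are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Perceptron
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_meas = 40                                       # assumed measurement count
base = rng.normal(size=(2000, 8)) @ rng.normal(size=(8, n_meas))
clean = base + 0.1 * rng.normal(size=base.shape)  # highly correlated residuals
attacked = clean.copy()
attacked[:, rng.integers(0, n_meas, 3)] += 3.0    # bias from injected false data

X = np.vstack([clean, attacked])
y = np.r_[np.zeros(len(clean)), np.ones(len(attacked))]

Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)
# PCA exploits the correlation to shrink the perceptron's input dimension.
model = make_pipeline(PCA(n_components=8), Perceptron(random_state=0))
print("detection accuracy:", model.fit(Xtr, ytr).score(Xte, yte))
```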
Localizing concurrency faults that occur in production is hard because (1) detailed field data, such as user input, file content, and interleaving schedule, may not be available to developers to reproduce the failure; (2) it is often impractical to assume the availability of multiple failing executions to localize the faults using existing techniques; (3) it is challenging to search for buggy locations in an application given limited runtime data; and (4) concurrency failures at the system level often involve multiple processes or event handlers (e.g., software signals), which cannot be handled by existing tools for diagnosing intra-process (thread-level) failures. To address these problems, we present SCMiner, a practical online bug diagnosis tool that helps developers understand how a system-level concurrency fault happens based on the logs collected by the default system audit tools. SCMiner achieves online bug diagnosis, obviating the need for offline bug reproduction. SCMiner does not require code instrumentation on the production system, nor does it rely on the availability of multiple failing executions. Specifically, after the system call traces are collected, SCMiner uses data mining and statistical anomaly detection techniques to identify the failure-inducing system call sequences. It then maps each abnormal sequence to specific application functions. We have conducted an empirical study on 19 real-world benchmarks. The results show that SCMiner is both effective and efficient at localizing system-level concurrency faults.
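A minimal sketch of the mining step on system call traces, not SCMiner itself: build a frequency model of system-call n-grams from normal traces, then score a trace by the fraction of n-grams never seen in normal operation. The traces below are invented.

```python
from collections import Counter

def ngrams(trace, n=3):
    """Sliding n-grams over a sequence of system call names."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]

# Hypothetical audit-log traces from normal executions.
normal_traces = [
    ["open", "read", "write", "close"],
    ["open", "read", "read", "write", "close"],
    ["open", "write", "close"],
] * 50
model = Counter(g for t in normal_traces for g in ngrams(t))

def anomaly_score(trace, model, n=3):
    grams = ngrams(trace, n)
    rare = sum(1 for g in grams if model[g] == 0)
    return rare / max(len(grams), 1)   # fraction of never-seen n-grams

# A trace with an unexpected interleaving scores high and would then be
# mapped back to the application functions that issued those calls.
suspect = ["open", "read", "kill", "write", "close"]
print("anomaly score:", anomaly_score(suspect, model))
```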