Biblio
With the exponential hike in cyber threats, organizations are now striving for better data mining techniques in order to analyze security logs received from their IT infrastructures to ensure effective and automated cyber threat detection. Machine Learning (ML) based analytics for security machine data is the next emerging trend in cyber security, aimed at mining security data to uncover advanced targeted cyber threats actors and minimizing the operational overheads of maintaining static correlation rules. However, selection of optimal machine learning algorithm for security log analytics still remains an impeding factor against the success of data science in cyber security due to the risk of large number of false-positive detections, especially in the case of large-scale or global Security Operations Center (SOC) environments. This fact brings a dire need for an efficient machine learning based cyber threat detection model, capable of minimizing the false detection rates. In this paper, we are proposing optimal machine learning algorithms with their implementation framework based on analytical and empirical evaluations of gathered results, while using various prediction, classification and forecasting algorithms.
The power outages of the last couple of years around the world introduce the indispensability of technological development to improve the traditional power grids. Early warnings of imminent failures represent one of the major required improvements. Costly blackouts throughout the world caused by the different severe incidents in traditional power grids have motivated researchers to diagnose and investigate previous blackouts and propose a prediction model that enables to prevent power outages. Although, in the new generation of power grid, the smart grid's (SG) real time data can be used from smart meters (SMs) and phasor measurement unit sensors (PMU) to prevent blackout, it demands high reliability and stability against power outages. This paper implements a proactive prediction model based on deep-belief networks that can predict imminent blackout. The proposed model is evaluated on a real smart grid dataset. Promising results are reported in the case study.
Customers need to know how reliable a new release is, and whether or not the new release has substantially different, either better or worse, reliability than the one currently in production. Customers are demanding quantitative evidence, based on pre-release metrics, to help them decide whether or not to upgrade (and thereby offer new features and capabilities to their customers). Finding ways to estimate future reliability performance is not easy - we have evaluated many prerelease development and test metrics in search of reliability predictors that are sufficiently accurate and also apply to a broad range of software products. This paper describes a successful model that has resulted from these efforts, and also presents both a functional extension and a further conceptual simplification of the extended model that enables us to better communicate key release information to internal stakeholders and customers, without sacrificing predictive accuracy or generalizability. Work remains to be done, but the results of the original model, the extended model, and the simplified version are encouraging and are currently being applied across a range of products and releases. To evaluate whether or not these early predictions are accurate, and also to compare releases that are available to customers, we use a field software reliability assessment mechanism that incorporates two types of customer experience metrics: field bug encounters normalized by usage, and field bug counts, also normalized by usage. Our 'release-overrelease' strategy combines the 'maturity assessment' component (i.e., estimating reliability prior to release to the field) and the 'reliability assessment' component (i.e., gauging actual reliability after release to the field). This overall approach enables us to both predict reliability and compare reliability results for recent releases for a product.
The correct prediction of faulty modules or classes has a number of advantages such as improving the quality of software and assigning capable development resources to fix such faults. There have been different kinds of fault/defect prediction models proposed in literature, but a great majority of them makes use of static code metrics as independent variables for making predictions. Recently, process metrics have gained a considerable attention as alternative metrics to use for making trust-worthy predictions. The objective of this paper is to investigate different combinations of static code and process metrics for evaluating fault prediction performance. We have used publicly available data sets, along with a frequently used classifier, Naive Bayes, to run our experiments. We have, both statistically and visually, analyzed our experimental results. The statistical analysis showed evidence against any significant difference in fault prediction performances for a variety of different combinations of metrics. This reinforced earlier research results that process metrics are as good as predictors of fault proneness as static code metrics. Furthermore, the visual inspection of box plots revealed that the best set of metrics for fault prediction is a mix of both static code and process metrics. We also presented evidence in support of some process metrics being more discriminating than others and thus making them as good predictors to use.
Android applications pose security and privacy risks for end-users. These risks are often quantified by performing dynamic analysis and permission analysis of the Android applications after release. Prediction of security and privacy risks associated with Android applications at early stages of application development, e.g. when the developer (s) are writing the code of the application, might help Android application developers in releasing applications to end-users that have less security and privacy risk. The goal of this paper is to aid Android application developers in assessing the security and privacy risk associated with Android applications by using static code metrics as predictors. In our paper, we consider security and privacy risk of Android application as how susceptible the application is to leaking private information of end-users and to releasing vulnerabilities. We investigate how effectively static code metrics that are extracted from the source code of Android applications, can be used to predict security and privacy risk of Android applications. We collected 21 static code metrics of 1,407 Android applications, and use the collected static code metrics to predict security and privacy risk of the applications. As the oracle of security and privacy risk, we used Androrisk, a tool that quantifies the amount of security and privacy risk of an Android application using analysis of Android permissions and dynamic analysis. To accomplish our goal, we used statistical learners such as, radial-based support vector machine (r-SVM). For r-SVM, we observe a precision of 0.83. Findings from our paper suggest that with proper selection of static code metrics, r-SVM can be used effectively to predict security and privacy risk of Android applications.
Embedded systems must address a multitude of potentially conflicting design constraints such as resiliency, energy, heat, cost, performance, security, etc., all in the face of highly dynamic operational behaviors and environmental conditions. By incorporating elements of intelligence, the hope is that the resulting “smart” embedded systems will function correctly and within desired constraints in spite of highly dynamic changes in the applications and the environment, as well as in the underlying software/hardware platforms. Since terms related to “smartness” (e.g., self-awareness, self-adaptivity, and autonomy) have been used loosely in many software and hardware computing contexts, we first present a taxonomy of “self-x” terms and use this taxonomy to relate major “smart” software and hardware computing efforts. A major attribute for smart embedded systems is the notion of self-awareness that enables an embedded system to monitor its own state and behavior, as well as the external environment, so as to adapt intelligently. Toward this end, we use a System-on-Chip perspective to show how the CyberPhysical System-on-Chip (CPSoC) exemplar platform achieves self-awareness through a combination of cross-layer sensing, actuation, self-aware adaptations, and online learning. We conclude with some thoughts on open challenges and research directions.
Vulnerabilities usually represents the risk level of software, and it is of high value to forecast vulnerabilities so as to evaluate the security level of software. Current researches mainly focus on predicting the number of vulnerabilities or the occurrence time of vulnerabilities, however, to our best knowledge, there are no other researches focusing on the prediction of vulnerabilities' severity, which we think is an important aspect reflecting vulnerabilities and software security. To compensate for this deficiency, we borrows the grey model GM(1,1) from grey system theory to forecast the severity of vulnerabilities. The experiment is carried on the real data collected from CVE and proves the feasibility of our predicting method.
Compressed sensing (CS) or compressive sampling deals with reconstruction of signals from limited observations/ measurements far below the Nyquist rate requirement. This is essential in many practical imaging system as sampling at Nyquist rate may not always be possible due to limited storage facility, slow sampling rate or the measurements are extremely expensive e.g. magnetic resonance imaging (MRI). Mathematically, CS addresses the problem for finding out the root of an unknown distribution comprises of unknown as well as known observations. Robbins-Monro (RM) stochastic approximation, a non-parametric approach, is explored here as a solution to CS reconstruction problem. A distance based linear prediction using the observed measurements is done to obtain the unobserved samples followed by random noise addition to act as residual (prediction error). A spatial domain adaptive Wiener filter is then used to diminish the noise and to reveal the new features from the degraded observations. Extensive simulation results highlight the relative performance gain over the existing work.
This paper examines security faults/vulnerabilities reported for Fedora. Results indicate that, at least in some situations, fault roughly constant may be used to guide estimation of residual vulnerabilities in an already released product, as well as possibly guide testing of the next version of the product.