Visible to the public Biblio

Filters: Keyword is predictive security metrics  [Clear All Filters]
2020-04-03
Bhamidipati, Venkata Siva Vijayendra, Chan, Michael, Jain, Arpit, Murthy, Ashok Srinivasa, Chamorro, Derek, Muralidhar, Aniruddh Kamalapuram.  2019.  Predictive Proof of Metrics – a New Blockchain Consensus Protocol. 2019 Sixth International Conference on Internet of Things: Systems, Management and Security (IOTSMS). :498—505.
We present a new consensus protocol for Blockchain ecosystems - PPoM - Predictive Proof of Metrics. First, we describe the motivation for PPoM - why we need it. Then, we outline its architecture, components, and operation. As part of this, we detail our reputation and reward based approach to bring about consensus in the Blockchain. We also address security and scalability for a PPoM based Blockchain, and discuss potential improvements for future work. Finally, we present measurements for our short term Provider Prediction engine.
2019-11-12
Katsini, Christina, Raptis, George E., Fidas, Christos, Avouris, Nikolaos.  2018.  Towards Gaze-Based Quantification of the Security of Graphical Authentication Schemes. Proceedings of the 2018 ACM Symposium on Eye Tracking Research & Applications. :17:1-17:5.

In this paper, we introduce a two-step method for estimating the strength of user-created graphical passwords based on the eye-gaze behaviour during password composition. First, the individuals' gaze patterns, represented by the unique fixations on each area of interest (AOI) and the total fixation duration per AOI, are calculated. Second, the gaze-based entropy of the individual is calculated. To investigate whether the proposed metric is a credible predictor of the password strength, we conducted two feasibility studies. Results revealed a strong positive correlation between the strength of the created passwords and the gaze-based entropy. Hence, we argue that the proposed gaze-based metric allows for unobtrusive prediction of the strength of the password a user is going to create and enables intervention to the password composition for helping users create stronger passwords.

Werner, Gordon, Okutan, Ahmet, Yang, Shanchieh, McConky, Katie.  2018.  Forecasting Cyberattacks as Time Series with Different Aggregation Granularity. 2018 IEEE International Symposium on Technologies for Homeland Security (HST). :1-7.

Cyber defense can no longer be limited to intrusion detection methods. These systems require malicious activity to enter an internal network before an attack can be detected. Having advanced, predictive knowledge of future attacks allow a potential victim to heighten security and possibly prevent any malicious traffic from breaching the network. This paper investigates the use of Auto-Regressive Integrated Moving Average (ARIMA) models and Bayesian Networks (BN) to predict future cyber attack occurrences and intensities against two target entities. In addition to incident count forecasting, categorical and binary occurrence metrics are proposed to better represent volume forecasts to a victim. Different measurement periods are used in time series construction to better model the temporal patterns unique to each attack type and target configuration, seeing over 86% improvement over baseline forecasts. Using ground truth aggregated over different measurement periods as signals, a BN is trained and tested for each attack type and the obtained results provided further evidence to support the findings from ARIMA. This work highlights the complexity of cyber attack occurrences; each subset has unique characteristics and is influenced by a number of potential external factors.

Zhang, Xian, Ben, Kerong, Zeng, Jie.  2018.  Cross-Entropy: A New Metric for Software Defect Prediction. 2018 IEEE International Conference on Software Quality, Reliability and Security (QRS). :111-122.

Defect prediction is an active topic in software quality assurance, which can help developers find potential bugs and make better use of resources. To improve prediction performance, this paper introduces cross-entropy, one common measure for natural language, as a new code metric into defect prediction tasks and proposes a framework called DefectLearner for this process. We first build a recurrent neural network language model to learn regularities in source code from software repository. Based on the trained model, the cross-entropy of each component can be calculated. To evaluate the discrimination for defect-proneness, cross-entropy is compared with 20 widely used metrics on 12 open-source projects. The experimental results show that cross-entropy metric is more discriminative than 50% of the traditional metrics. Besides, we combine cross-entropy with traditional metric suites together for accurate defect prediction. With cross-entropy added, the performance of prediction models is improved by an average of 2.8% in F1-score.

Vizarreta, Petra, Sakic, Ermin, Kellerer, Wolfgang, Machuca, Carmen Mas.  2019.  Mining Software Repositories for Predictive Modelling of Defects in SDN Controller. 2019 IFIP/IEEE Symposium on Integrated Network and Service Management (IM). :80-88.

In Software Defined Networking (SDN) control plane of forwarding devices is concentrated in the SDN controller, which assumes the role of a network operating system. Big share of today's commercial SDN controllers are based on OpenDaylight, an open source SDN controller platform, whose bug repository is publicly available. In this article we provide a first insight into 8k+ bugs reported in the period over five years between March 2013 and September 2018. We first present the functional components in OpenDaylight architecture, localize the most vulnerable modules and measure their contribution to the total bug content. We provide high fidelity models that can accurately reproduce the stochastic behaviour of bug manifestation and bug removal rates, and discuss how these can be used to optimize the planning of the test effort, and to improve the software release management. Finally, we study the correlation between the code internals, derived from the Git version control system, and software defect metrics, derived from Jira issue tracker. To the best of our knowledge, this is the first study to provide a comprehensive analysis of bug characteristics in a production grade SDN controller.

Ferenc, Rudolf, Heged\H us, Péter, Gyimesi, Péter, Antal, Gábor, Bán, Dénes, Gyimóthy, Tibor.  2019.  Challenging Machine Learning Algorithms in Predicting Vulnerable JavaScript Functions. 2019 IEEE/ACM 7th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE). :8-14.

The rapid rise of cyber-crime activities and the growing number of devices threatened by them place software security issues in the spotlight. As around 90% of all attacks exploit known types of security issues, finding vulnerable components and applying existing mitigation techniques is a viable practical approach for fighting against cyber-crime. In this paper, we investigate how the state-of-the-art machine learning techniques, including a popular deep learning algorithm, perform in predicting functions with possible security vulnerabilities in JavaScript programs. We applied 8 machine learning algorithms to build prediction models using a new dataset constructed for this research from the vulnerability information in public databases of the Node Security Project and the Snyk platform, and code fixing patches from GitHub. We used static source code metrics as predictors and an extensive grid-search algorithm to find the best performing models. We also examined the effect of various re-sampling strategies to handle the imbalanced nature of the dataset. The best performing algorithm was KNN, which created a model for the prediction of vulnerable functions with an F-measure of 0.76 (0.91 precision and 0.66 recall). Moreover, deep learning, tree and forest based classifiers, and SVM were competitive with F-measures over 0.70. Although the F-measures did not vary significantly with the re-sampling strategies, the distribution of precision and recall did change. No re-sampling seemed to produce models preferring high precision, while re-sampling strategies balanced the IR measures.

Wei, Shengjun, Zhong, Hao, Shan, Chun, Ye, Lin, Du, Xiaojiang, Guizani, Mohsen.  2018.  Vulnerability Prediction Based on Weighted Software Network for Secure Software Building. 2018 IEEE Global Communications Conference (GLOBECOM). :1-6.

To build a secure communications software, Vulnerability Prediction Models (VPMs) are used to predict vulnerable software modules in the software system before software security testing. At present many software security metrics have been proposed to design a VPM. In this paper, we predict vulnerable classes in a software system by establishing the system's weighted software network. The metrics are obtained from the nodes' attributes in the weighted software network. We design and implement a crawler tool to collect all public security vulnerabilities in Mozilla Firefox. Based on these data, the prediction model is trained and tested. The results show that the VPM based on weighted software network has a good performance in accuracy, precision, and recall. Compared to other studies, it shows that the performance of prediction has been improved greatly in Pr and Re.

Zhang, Tianwei, Zhang, Yinqian, Lee, Ruby B..  2018.  Analyzing Cache Side Channels Using Deep Neural Networks. Proceedings of the 34th Annual Computer Security Applications Conference. :174-186.

Cache side-channel attacks aim to breach the confidentiality of a computer system and extract sensitive secrets through CPU caches. In the past years, different types of side-channel attacks targeting a variety of cache architectures have been demonstrated. Meanwhile, different defense methods and systems have also been designed to mitigate these attacks. However, quantitatively evaluating the effectiveness of these attacks and defenses has been challenging. We propose a generic approach to evaluating cache side-channel attacks and defenses. Specifically, our method builds a deep neural network with its inputs as the adversary's observed information, and its outputs as the victim's execution traces. By training the neural network, the relationship between the inputs and outputs can be automatically discovered. As a result, the prediction accuracy of the neural network can serve as a metric to quantify how much information the adversary can obtain correctly, and how effective a defense solution is in reducing the information leakage under different attack scenarios. Our evaluation suggests that the proposed method can effectively evaluate different attacks and defenses.

2019-07-01
Clemente, C. J., Jaafar, F., Malik, Y..  2018.  Is Predicting Software Security Bugs Using Deep Learning Better Than the Traditional Machine Learning Algorithms? 2018 IEEE International Conference on Software Quality, Reliability and Security (QRS). :95–102.

Software insecurity is being identified as one of the leading causes of security breaches. In this paper, we revisited one of the strategies in solving software insecurity, which is the use of software quality metrics. We utilized a multilayer deep feedforward network in examining whether there is a combination of metrics that can predict the appearance of security-related bugs. We also applied the traditional machine learning algorithms such as decision tree, random forest, naïve bayes, and support vector machines and compared the results with that of the Deep Learning technique. The results have successfully demonstrated that it was possible to develop an effective predictive model to forecast software insecurity based on the software metrics and using Deep Learning. All the models generated have shown an accuracy of more than sixty percent with Deep Learning leading the list. This finding proved that utilizing Deep Learning methods and a combination of software metrics can be tapped to create a better forecasting model thereby aiding software developers in predicting security bugs.

Meryem, Amar, Samira, Douzi, Bouabid, El Ouahidi.  2018.  Enhancing Cloud Security Using Advanced MapReduce K-means on Log Files. Proceedings of the 2018 International Conference on Software Engineering and Information Management. :63–67.

Many customers ranked cloud security as a major challenge that threaten their work and reduces their trust on cloud service's provider. Hence, a significant improvement is required to establish better adaptations of security measures that suit recent technologies and especially distributed architectures. Considering the meaningful recorded data in cloud generated log files, making analysis on them, mines insightful value about hacker's activities. It identifies malicious user behaviors and predicts new suspected events. Not only that, but centralizing log files, prevents insiders from causing damage to system. In this paper, we proposed to take away sensitive log files into a single server provider and combining both MapReduce programming and k-means on the same algorithm to cluster observed events into classes having similar features. To label unknown user behaviors and predict new suspected activities this approach considers cosine distances and deviation metrics.

Pope, Aaron Scott, Morning, Robert, Tauritz, Daniel R., Kent, Alexander D..  2018.  Automated Design of Network Security Metrics. Proceedings of the Genetic and Evolutionary Computation Conference Companion. :1680–1687.

Many abstract security measurements are based on characteristics of a graph that represents the network. These are typically simple and quick to compute but are often of little practical use in making real-world predictions. Practical network security is often measured using simulation or real-world exercises. These approaches better represent realistic outcomes but can be costly and time-consuming. This work aims to combine the strengths of these two approaches, developing efficient heuristics that accurately predict attack success. Hyper-heuristic machine learning techniques, trained on network attack simulation training data, are used to produce novel graph-based security metrics. These low-cost metrics serve as an approximation for simulation when measuring network security in real time. The approach is tested and verified using a simulation based on activity from an actual large enterprise network. The results demonstrate the potential of using hyper-heuristic techniques to rapidly evolve and react to emerging cybersecurity threats.

2017-12-28
Stuckman, J., Walden, J., Scandariato, R..  2017.  The Effect of Dimensionality Reduction on Software Vulnerability Prediction Models. IEEE Transactions on Reliability. 66:17–37.

Statistical prediction models can be an effective technique to identify vulnerable components in large software projects. Two aspects of vulnerability prediction models have a profound impact on their performance: 1) the features (i.e., the characteristics of the software) that are used as predictors and 2) the way those features are used in the setup of the statistical learning machinery. In a previous work, we compared models based on two different types of features: software metrics and term frequencies (text mining features). In this paper, we broaden the set of models we compare by investigating an array of techniques for the manipulation of said features. These techniques fall under the umbrella of dimensionality reduction and have the potential to improve the ability of a prediction model to localize vulnerabilities. We explore the role of dimensionality reduction through a series of cross-validation and cross-project prediction experiments. Our results show that in the case of software metrics, a dimensionality reduction technique based on confirmatory factor analysis provided an advantage when performing cross-project prediction, yielding the best F-measure for the predictions in five out of six cases. In the case of text mining, feature selection can make the prediction computationally faster, but no dimensionality reduction technique provided any other notable advantage.

Noureddine, M. A., Marturano, A., Keefe, K., Bashir, M., Sanders, W. H..  2017.  Accounting for the Human User in Predictive Security Models. 2017 IEEE 22nd Pacific Rim International Symposium on Dependable Computing (PRDC). :329–338.

Given the growing sophistication of cyber attacks, designing a perfectly secure system is not generally possible. Quantitative security metrics are thus needed to measure and compare the relative security of proposed security designs and policies. Since the investigation of security breaches has shown a strong impact of human errors, ignoring the human user in computing these metrics can lead to misleading results. Despite this, and although security researchers have long observed the impact of human behavior on system security, few improvements have been made in designing systems that are resilient to the uncertainties in how humans interact with a cyber system. In this work, we develop an approach for including models of user behavior, emanating from the fields of social sciences and psychology, in the modeling of systems intended to be secure. We then illustrate how one of these models, namely general deterrence theory, can be used to study the effectiveness of the password security requirements policy and the frequency of security audits in a typical organization. Finally, we discuss the many challenges that arise when adopting such a modeling approach, and then present our recommendations for future work.

Stanić, B., Afzal, W..  2017.  Process Metrics Are Not Bad Predictors of Fault Proneness. 2017 IEEE International Conference on Software Quality, Reliability and Security Companion (QRS-C). :493–499.

The correct prediction of faulty modules or classes has a number of advantages such as improving the quality of software and assigning capable development resources to fix such faults. There have been different kinds of fault/defect prediction models proposed in literature, but a great majority of them makes use of static code metrics as independent variables for making predictions. Recently, process metrics have gained a considerable attention as alternative metrics to use for making trust-worthy predictions. The objective of this paper is to investigate different combinations of static code and process metrics for evaluating fault prediction performance. We have used publicly available data sets, along with a frequently used classifier, Naive Bayes, to run our experiments. We have, both statistically and visually, analyzed our experimental results. The statistical analysis showed evidence against any significant difference in fault prediction performances for a variety of different combinations of metrics. This reinforced earlier research results that process metrics are as good as predictors of fault proneness as static code metrics. Furthermore, the visual inspection of box plots revealed that the best set of metrics for fault prediction is a mix of both static code and process metrics. We also presented evidence in support of some process metrics being more discriminating than others and thus making them as good predictors to use.

Boucher, A., Badri, M..  2017.  Predicting Fault-Prone Classes in Object-Oriented Software: An Adaptation of an Unsupervised Hybrid SOM Algorithm. 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS). :306–317.

Many fault-proneness prediction models have been proposed in literature to identify fault-prone code in software systems. Most of the approaches use fault data history and supervised learning algorithms to build these models. However, since fault data history is not always available, some approaches also suggest using semi-supervised or unsupervised fault-proneness prediction models. The HySOM model, proposed in literature, uses function-level source code metrics to predict fault-prone functions in software systems, without using any fault data. In this paper, we adapt the HySOM approach for object-oriented software systems to predict fault-prone code at class-level granularity using object-oriented source code metrics. This adaptation makes it easier to prioritize the efforts of the testing team as unit tests are often written for classes in object-oriented software systems, and not for methods. Our adaptation also generalizes one main element of the HySOM model, which is the calculation of the source code metrics threshold values. We conducted an empirical study using 12 public datasets. Results show that the adaptation of the HySOM model for class-level fault-proneness prediction improves the consistency and the performance of the model. We additionally compared the performance of the adapted model to supervised approaches based on the Naive Bayes Network, ANN and Random Forest algorithms.

Poon, W. N., Bennin, K. E., Huang, J., Phannachitta, P., Keung, J. W..  2017.  Cross-Project Defect Prediction Using a Credibility Theory Based Naive Bayes Classifier. 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS). :434–441.

Several defect prediction models proposed are effective when historical datasets are available. Defect prediction becomes difficult when no historical data exist. Cross-project defect prediction (CPDP), which uses projects from other sources/companies to predict the defects in the target projects proposed in recent studies has shown promising results. However, the performance of most CPDP approaches are still beyond satisfactory mainly due to distribution mismatch between the source and target projects. In this study, a credibility theory based Naïve Bayes (CNB) classifier is proposed to establish a novel reweighting mechanism between the source projects and target projects so that the source data could simultaneously adapt to the target data distribution and retain its own pattern. Our experimental results show that the feasibility of the novel algorithm design and demonstrate the significant improvement in terms of the performance metrics considered achieved by CNB over other CPDP approaches.

Sultana, K. Z., Williams, B. J..  2017.  Evaluating micro patterns and software metrics in vulnerability prediction. 2017 6th International Workshop on Software Mining (SoftwareMining). :40–47.

Software security is an important aspect of ensuring software quality. Early detection of vulnerable code during development is essential for the developers to make cost and time effective software testing. The traditional software metrics are used for early detection of software vulnerability, but they are not directly related to code constructs and do not specify any particular granularity level. The goal of this study is to help developers evaluate software security using class-level traceable patterns called micro patterns to reduce security risks. The concept of micro patterns is similar to design patterns, but they can be automatically recognized and mined from source code. If micro patterns can better predict vulnerable classes compared to traditional software metrics, they can be used in developing a vulnerability prediction model. This study explores the performance of class-level patterns in vulnerability prediction and compares them with traditional class-level software metrics. We studied security vulnerabilities as reported for one major release of Apache Tomcat, Apache Camel and three stand-alone Java web applications. We used machine learning techniques for predicting vulnerabilities using micro patterns and class-level metrics as features. We found that micro patterns have higher recall in detecting vulnerable classes than the software metrics.

Sultana, K. Z..  2017.  Towards a software vulnerability prediction model using traceable code patterns and software metrics. 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE). :1022–1025.

Software security is an important aspect of ensuring software quality. The goal of this study is to help developers evaluate software security using traceable patterns and software metrics during development. The concept of traceable patterns is similar to design patterns but they can be automatically recognized and extracted from source code. If these patterns can better predict vulnerable code compared to traditional software metrics, they can be used in developing a vulnerability prediction model to classify code as vulnerable or not. By analyzing and comparing the performance of traceable patterns with metrics, we propose a vulnerability prediction model. This study explores the performance of some code patterns in vulnerability prediction and compares them with traditional software metrics. We use the findings to build an effective vulnerability prediction model. We evaluate security vulnerabilities reported for Apache Tomcat, Apache CXF and three stand-alone Java web applications. We use machine learning and statistical techniques for predicting vulnerabilities using traceable patterns and metrics as features. We found that patterns have a lower false negative rate and higher recall in detecting vulnerable code than the traditional software metrics.

2017-12-20
Abdelhamid, N., Thabtah, F., Abdel-jaber, H..  2017.  Phishing detection: A recent intelligent machine learning comparison based on models content and features. 2017 IEEE International Conference on Intelligence and Security Informatics (ISI). :72–77.

In the last decade, numerous fake websites have been developed on the World Wide Web to mimic trusted websites, with the aim of stealing financial assets from users and organizations. This form of online attack is called phishing, and it has cost the online community and the various stakeholders hundreds of million Dollars. Therefore, effective counter measures that can accurately detect phishing are needed. Machine learning (ML) is a popular tool for data analysis and recently has shown promising results in combating phishing when contrasted with classic anti-phishing approaches, including awareness workshops, visualization and legal solutions. This article investigates ML techniques applicability to detect phishing attacks and describes their pros and cons. In particular, different types of ML techniques have been investigated to reveal the suitable options that can serve as anti-phishing tools. More importantly, we experimentally compare large numbers of ML techniques on real phishing datasets and with respect to different metrics. The purpose of the comparison is to reveal the advantages and disadvantages of ML predictive models and to show their actual performance when it comes to phishing attacks. The experimental results show that Covering approach models are more appropriate as anti-phishing solutions, especially for novice users, because of their simple yet effective knowledge bases in addition to their good phishing detection rate.

2017-04-11
Akond Rahman, Priysha Pradhan, Asif Parthoϕ, Laurie Williams.  2017.  Predicting Android Application Security and Privacy Risk With Static Code Metrics. 4th IEEE/ACM International Conference on Mobile Software Engineering and Systems.

Android applications pose security and privacy risks for end-users. These risks are often quantified by performing dynamic analysis and permission analysis of the Android applications after release. Prediction of security and privacy risks associated with Android applications at early stages of application development, e.g. when the developer (s) are
writing the code of the application, might help Android application developers in releasing applications to end-users that have less security and privacy risk. The goal of this paper
is to aid Android application developers in assessing the security and privacy risk associated with Android applications by using static code metrics as predictors. In our paper, we consider security and privacy risk of Android application as how susceptible the application is to leaking private information of end-users and to releasing vulnerabilities. We investigate how effectively static code metrics that are extracted from the source code of Android applications, can be used to predict security and privacy risk of Android applications. We collected 21 static code metrics of 1,407 Android applications, and use the collected static code metrics to predict security and privacy risk of the applications. As the oracle of security and privacy risk, we used Androrisk, a tool that quantifies the amount of security and privacy risk of an Android application using analysis of Android permissions and dynamic analysis. To accomplish our goal, we used statistical learners such as, radial-based support vector machine (r-SVM). For r-SVM, we observe a precision of 0.83. Findings from our paper suggest that with proper selection of static code metrics, r-SVM can be used effectively to predict security and privacy risk of Android applications

2016-11-16