Bibliography
Machine-learning solutions have been successfully adopted in many contexts, but applying these techniques to the cyber security domain is complex and still immature. Among the many open issues affecting security systems based on machine learning, we concentrate on adversarial attacks that aim to degrade the detection and prediction capabilities of machine-learning models. We consider realistic types of poisoning and evasion attacks targeting security solutions devoted to malware, spam and network intrusion detection. We explore the damage an attacker can cause to a cyber detector and present existing and original defensive techniques in the context of intrusion detection systems. The paper contains several performance evaluations based on extensive experiments using large traffic datasets. The results highlight that modern adversarial attacks are highly effective against machine-learning classifiers for cyber detection, and that existing solutions require improvements in several directions. The paper paves the way for more robust machine-learning-based techniques that can be integrated into cyber security platforms.
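Such an evasion attack can be illustrated with a minimal sketch, assuming a scikit-learn classifier already fitted on numeric flow features; the mutable feature indices, step size and benign label below are illustrative assumptions, not details taken from the paper.

# Minimal sketch: greedily perturb attacker-controllable flow features until
# a fitted classifier labels the flow as benign (assumed label 0).
import numpy as np

def evade(clf, flow, mutable_idx, step=0.1, max_iter=50):
    x = flow.astype(float).copy()
    for _ in range(max_iter):
        if clf.predict(x.reshape(1, -1))[0] == 0:
            return x, True                      # evasion succeeded
        # inflate mutable features, e.g. flow duration or exchanged bytes
        x[mutable_idx] += step * (np.abs(x[mutable_idx]) + 1.0)
    return x, False

# Hypothetical usage with any fitted scikit-learn model:
# adv_flow, ok = evade(clf, malicious_flow, mutable_idx=[0, 3, 5])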
Deep neural networks are widely used across many domains. Techniques such as transfer learning enable neural networks pre-trained on one task to be retrained for a new one, often with much less data. Users have access to the pre-trained model parameters and the model definition, along with test data, but have only limited access to the training data, or merely a subset of it. This is risky for system-critical applications, where adversarial information can be maliciously included during the training phase to attack the system. Determining whether a model has been attacked, and to what extent, is challenging. In this paper, we present evidence of how adversarially attacking the training data expands the boundary of the model parameters, using a CNN model and the MNIST data set as a test case. This expansion is due to new characteristics of the poisonous data added to the training data. Approaching the problem from the feature space learned by the network provides a relation between these features and the possible parameters the model can take during training. We propose an algorithm that determines whether a given network was attacked during training by comparing the boundaries of the parameter distributions in intermediate layers, estimated using the maximum entropy principle and variational inference.
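As a rough illustration of the detection idea, the following sketch compares per-layer parameter ranges of a suspect model against bounds estimated from clean reference models; plain min/max statistics stand in here for the maximum-entropy and variational estimates used in the paper.

# Minimal sketch: flag layers whose weights fall outside bounds estimated
# from clean reference models (each model given as a list of numpy arrays).
import numpy as np

def layer_bounds(reference_models):
    bounds = []
    for layer in range(len(reference_models[0])):
        vals = np.concatenate([m[layer].ravel() for m in reference_models])
        bounds.append((vals.min(), vals.max()))
    return bounds

def flag_suspect_layers(suspect_weights, bounds, tol=0.05):
    flags = []
    for w, (lo, hi) in zip(suspect_weights, bounds):
        margin = tol * (hi - lo)
        flags.append(w.min() < lo - margin or w.max() > hi + margin)
    return flags    # True marks a layer whose parameters exceed clean bounds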
Federated learning is a novel distributed learning framework in which a deep learning model is trained collaboratively among thousands of participants. Only model parameters are shared between the server and participants, which prevents the server from directly accessing the private training data. However, we observe that the federated learning architecture is vulnerable to an active attack from insider participants, called a poisoning attack, in which the attacker acts as a benign participant and uploads poisoned updates to the server, easily degrading the performance of the global model. In this work, we study and evaluate a poisoning attack on a federated learning system based on generative adversarial networks (GANs). The attacker first acts as a benign participant and stealthily trains a GAN to mimic prototypical samples from the other participants' training sets, to which the attacker has no access. The attacker then uses these generated samples to craft poisoning updates and compromises the global model by uploading scaled poisoning updates to the server. In our evaluation, we show that the attacker can successfully generate samples of other benign participants using the GAN, and that the global model achieves more than 80% accuracy on both the poisoning task and the main task.
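A minimal sketch of the update-scaling step is shown below; model weights are plain numpy arrays, the local training routine and the GAN-generated, mislabeled samples are assumed to exist, and the scale factor is a generic replacement heuristic rather than the authors' exact choice.

# Minimal sketch: a malicious participant trains on GAN-generated samples with
# wrong labels and scales its update so it dominates the server-side average.
import numpy as np

def poisoned_update(global_weights, local_train_fn, gan_samples, wrong_labels,
                    scale=10.0):
    local_weights = local_train_fn(global_weights, gan_samples, wrong_labels)
    return [scale * (lw - gw) for lw, gw in zip(local_weights, global_weights)]

def fedavg(global_weights, participant_updates):
    # Server aggregation: average all updates (benign and poisoned alike).
    return [gw + np.mean([u[i] for u in participant_updates], axis=0)
            for i, gw in enumerate(global_weights)]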
Short-term load forecasting systems for power grids have demonstrated high accuracy and have been widely employed for commercial use. However, classic load forecasting systems, which are based on statistical methods, are vulnerable to training data poisoning. In this paper, we demonstrate a data poisoning strategy that effectively corrupts the forecasting model even in the presence of outlier detection. To the best of our knowledge, poisoning attacks on short-term load forecasting with outlier detection have not been studied in previous work. Our method applies to several forecasting models, including the most widely adopted and best-performing ones, such as multiple linear regression (MLR) and neural network (NN) models. Starting with the MLR model, we develop a novel closed-form solution to quickly estimate the new MLR model after a round of data poisoning without retraining. We then employ line search and simulated annealing to find the poisoning attack solution. Furthermore, we use the MLR attack solution to generate a numerical solution for other models, such as NN. The effectiveness of our algorithm has been tested on the Global Energy Forecasting Competition (GEFCom2012) data set in the presence of outlier detection.
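The closed-form idea can be sketched as follows for ordinary least squares, using a Sherman-Morrison rank-one update to obtain the MLR coefficients after a single poisoned point is injected; this illustrates the general mechanism, not the authors' exact derivation.

# Minimal sketch: update MLR coefficients after injecting one poisoned point
# (x_p, y_p) without refitting, via the Sherman-Morrison rank-one update.
import numpy as np

def fit_mlr(X, y):
    XtX_inv = np.linalg.inv(X.T @ X)
    Xty = X.T @ y
    return XtX_inv @ Xty, XtX_inv, Xty

def add_poisoned_point(XtX_inv, Xty, x_p, y_p):
    x_p = x_p.reshape(-1, 1)
    Ax = XtX_inv @ x_p
    # (A + x x^T)^{-1} = A^{-1} - A^{-1} x x^T A^{-1} / (1 + x^T A^{-1} x)
    XtX_inv_new = XtX_inv - (Ax @ Ax.T) / (1.0 + float(x_p.T @ Ax))
    Xty_new = Xty + x_p.ravel() * y_p
    beta_new = XtX_inv_new @ Xty_new
    return beta_new, XtX_inv_new, Xty_new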
The novel networking technologies Software-Defined Networking (SDN) and Service Function Chaining (SFC) are growing rapidly, and security issues are emerging along with them. However, research on the security and safety of these novel networking environments is still insufficient, and vulnerabilities continue to be revealed. Among these security issues, this paper addresses an ARP poisoning attack that exploits an SFC vulnerability and proposes a method to defend against it. The proposed method recognizes the repetitive ARP replies that characterize an ARP poisoning attack and uses them to detect the attack. It overcomes the limitations of existing detection methods and detects the presence of an attack more accurately.
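The detection heuristic can be sketched as below: count ARP replies that repeatedly advertise the same IP-to-MAC binding within a short window and raise an alert above a threshold. Packet capture is abstracted into plain tuples, and the window and threshold values are illustrative.

# Minimal sketch: flag senders that issue many ARP replies for the same
# IP-to-MAC binding within a sliding time window.
from collections import defaultdict, deque

WINDOW_SECONDS = 10.0   # illustrative sliding window
THRESHOLD = 5           # illustrative replies-per-window alert level

def detect_arp_poisoning(arp_replies):
    """arp_replies: iterable of (timestamp, sender_mac, claimed_ip) tuples."""
    recent = defaultdict(deque)
    alerts = []
    for ts, mac, ip in arp_replies:
        window = recent[(mac, ip)]
        window.append(ts)
        while window and ts - window[0] > WINDOW_SECONDS:
            window.popleft()
        if len(window) >= THRESHOLD:
            alerts.append((ts, mac, ip))
    return alerts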
This Innovative Practice full paper presents a cloud-based personalized learning lab platform. Personalized learning is gaining popularity in online computer science education because it paces learning progress and adapts the instructional approach to each individual learner from a diverse background. Among the various instructional methods in computer science education, hands-on labs have unique requirements: understanding learner behavior and assessing learner performance for personalization. However, this is rarely addressed in existing research. In this paper, we propose a personalized learning platform called ThoTh Lab, specifically designed for computer science hands-on labs in a cloud environment. ThoTh Lab can identify a student's learning style from student activities and adapt learning material accordingly. With awareness of student learning styles, instructors can use techniques more suitable for specific students and thus improve the speed and quality of the learning process. ThoTh Lab also provides student performance prediction, which allows instructors to adjust the learning progress and take other measures to help students in a timely manner. For example, instructors may provide more detailed instructions to help slow starters, while assigning more challenging labs to quick learners in the same class. To evaluate ThoTh Lab, we conducted an experiment and collected data from an upper-division undergraduate cybersecurity class at Arizona State University in the US. The results show that ThoTh Lab can identify learning styles with reasonable accuracy. By leveraging the personalized lab platform for a senior-level cybersecurity course, our lab-use study also shows that the presented solution improves student engagement: students understand lab assignments better, spend more effort on hands-on projects, and thus achieve better learning outcomes.
This Innovative Practice Work in Progress paper is about education in cybersecurity, which is essential for training innovative talent in the era of the Internet. Besides knowledge and skills, it is also important to enhance students' awareness of cybersecurity in daily life. Considering that contactless smart cards are common and widely used in various areas, one basic and two advanced contactless smart card experiments were designed and assigned to junior students working in groups of three in an introductory cybersecurity summer course. The experimental principles, facilities, contents and arrangement are introduced in turn. Classroom tests were administered before and after the experiments, and a box-and-whisker plot is used to describe the distributions of the scores in both tests. The experimental output and student feedback indicate that the learning objectives were achieved through the problem-based, active and group learning experience during the experiments.
This Research Work in Progress paper presents a study on improving student learning performance in a virtual hands-on lab system in cybersecurity education. As the demand for cybersecurity-trained professionals is rapidly increasing, virtual hands-on lab systems have been introduced into cybersecurity education as a tool to enhance student learning. To improve learning in a virtual hands-on lab system, instructors need to understand: which learning activities are associated with students' learning performance in this system? What relationships exist between different learning activities? What can instructors do to improve learning outcomes in this system? However, few of these questions have been studied for virtual hands-on labs in cybersecurity education. In this research, we present our recent findings, identifying two learning activities that are positively associated with students' learning performance. Notably, the learning activity of reading lab materials (p < 0.01) plays a more significant role in hands-on learning than the learning activity of working on lab tasks (p < 0.05) in cybersecurity education. In addition, a student who spends more time reading lab materials tends to work longer on lab tasks (p < 0.01).
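The reported associations could be checked with a simple correlation analysis along the following lines; the column names and the use of Pearson correlation are assumptions for illustration, not the study's exact statistical procedure.

# Minimal sketch: test whether time spent reading lab materials and time spent
# on lab tasks are associated with lab scores (hypothetical column names).
import pandas as pd
from scipy.stats import pearsonr

def association_report(df: pd.DataFrame):
    for activity in ("reading_minutes", "task_minutes"):
        r, p = pearsonr(df[activity], df["lab_score"])
        print(f"{activity} vs lab_score: r={r:.2f}, p={p:.3f}")
    r, p = pearsonr(df["reading_minutes"], df["task_minutes"])
    print(f"reading_minutes vs task_minutes: r={r:.2f}, p={p:.3f}")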
Cybersecurity education is a pressing need at a time when computer systems and mobile devices are ubiquitous and so are the associated threats. However, the teaching and learning process in cybersecurity is challenging when students come from diverse disciplines with various academic backgrounds. In this project, a number of virtual laboratories were developed to facilitate the teaching and learning process in a cybersecurity course. The aim of the laboratories is to strengthen students' understanding of cybersecurity topics and to provide students with hands-on experience of encountering various security threats. The results of this project indicate that virtual laboratories do facilitate the teaching and learning process in cybersecurity for students from diverse disciplines. We also observed that students underestimate the difficulty of studying cybersecurity because of the general public image of cybersecurity, which had a negative impact on their interest in studying it.
Cybersecurity competitions have been shown to be an effective approach for promoting student engagement through active learning in cybersecurity. Players gain hands-on experience in puzzle-based or capture-the-flag tasks that promote learning. However, novice players with limited prior knowledge of cybersecurity often find it difficult to get a foothold on a problem and become frustrated at an early stage. To enhance student engagement, it is important to study the experiences of novices to better understand their learning needs. To achieve this goal, we conducted a 4-month longitudinal case study involving 11 undergraduate students participating in a college-level cybersecurity competition, the National Cyber League (NCL). The competition includes two individual games and one team game. Questionnaires and in-person interviews were conducted before and after each game to collect the players' feedback on their experience, learning challenges and needs, and information about their motivation, interests and confidence level. The collected data show that the players' primary concern going into these competitions stemmed from a lack of knowledge of cybersecurity concepts and tools, and that players' interest and confidence can be increased through systematic training.
Cloud Storage Brokers (CSBs) provide seamless and concurrent access to multiple Cloud Storage Services (CSSs) while abstracting cloud complexities from end-users. However, this multi-cloud strategy faces several security challenges, including enlarged attack surfaces, malicious insider threats, security complexities due to the integration of disparate components, and API interoperability issues. Novel security approaches are imperative to tackle these issues. Therefore, this paper proposes CSBAuditor, a novel cloud security system that continuously audits CSB resources to detect malicious activities and unauthorized changes, e.g., bucket policy misconfigurations, and remediates these anomalies. The cloud state is maintained via a continuous snapshotting mechanism, thereby ensuring fault tolerance. We adopt the principles of chaos engineering by integrating BrokerMonkey, a component that continuously injects failures into our reference CSB system, CloudRAID. Hence, CSBAuditor is continuously tested for efficiency, i.e., its ability to detect the changes injected by BrokerMonkey. CSBAuditor employs security metrics for risk analysis by computing severity scores for detected vulnerabilities using the Common Configuration Scoring System, thereby overcoming the limitation of insufficient security metrics in existing cloud auditing schemes. CSBAuditor has been tested using various strategies, including chaos engineering failure injection. Our experimental evaluation validates the efficiency of our approach against the aforementioned security issues, with a detection and recovery rate of over 96%.
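The audit loop idea can be sketched as follows: compare the current bucket configuration against the last known-good snapshot, score deviations, and trigger remediation. The fetch and remediation functions and the severity value are placeholders; the real system scores findings with the Common Configuration Scoring System.

# Minimal sketch: detect and remediate unauthorized bucket-policy changes by
# diffing the live configuration against a stored known-good snapshot.
def audit_buckets(snapshot, fetch_current_config, remediate, severity=7.0):
    """snapshot: dict bucket_name -> expected policy document."""
    findings = []
    for bucket, expected_policy in snapshot.items():
        current_policy = fetch_current_config(bucket)
        if current_policy != expected_policy:
            findings.append({"bucket": bucket,
                             "severity": severity,       # placeholder score
                             "expected": expected_policy,
                             "found": current_policy})
            remediate(bucket, expected_policy)           # restore known-good state
    return findings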
Testing, an indispensable part of software engineering, is itself an art and a science that has emerged as a discipline over time. If testing finds defects, testers reduce risk by raising awareness of the defects and providing solutions to deal with them before release. If testing does not find any defects, it assures that, under the tested conditions, the system functions correctly. To guarantee that enough testing has been done, the major risk areas need to be tested: we have to identify the risks, analyse and control them, and categorize the risk items to decide the extent of testing to be covered. In addition, the implementation of structured metrics is lagging in software testing; efficient metrics are necessary to evaluate and manage the testing process and to make testing part of the engineering discipline. This paper proposes risk-based testing using the FMEA technique and provides a set of metrics that help ensure an effective testing process.
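FMEA-based risk prioritization typically ranks items by a Risk Priority Number (severity × occurrence × detection); a minimal sketch of ordering test effort by RPN is given below, with the rating scales and threshold as illustrative assumptions.

# Minimal sketch: compute FMEA Risk Priority Numbers and order risk items so
# that high-RPN areas receive the deepest test coverage.
from dataclasses import dataclass

@dataclass
class RiskItem:
    name: str
    severity: int     # 1-10 rating scales assumed for illustration
    occurrence: int
    detection: int    # higher = harder to detect before release

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection

def prioritize(items, threshold=100):
    ranked = sorted(items, key=lambda i: i.rpn, reverse=True)
    return [(i.name, i.rpn, "extensive" if i.rpn >= threshold else "basic")
            for i in ranked]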
Widely applied in various fields, deep learning (DL) is becoming a key driving force in industry. Although it has achieved great success in artificial intelligence tasks, like traditional software it has defects, and once it fails, unpredictable accidents and losses can result. In this paper, we propose a test case generation technique based on an adversarial sample generation algorithm for image classification deep neural networks (DNNs), which can generate a large number of good test cases for the testing of DNNs, especially when test cases are insufficient. We briefly introduce our method and implement the framework. We conduct experiments on some classic DNN models and datasets, and further evaluate the test set using a coverage metric based on the states of the DNN.
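A framework-agnostic sketch of such adversarial test case generation is given below, using the well-known fast gradient sign method; the gradient function is assumed to be supplied by whichever DL framework hosts the model, and this is an illustration rather than the paper's specific algorithm.

# Minimal sketch: generate additional DNN test cases by perturbing seed images
# with an FGSM-style step and keeping those that flip the model's prediction.
import numpy as np

def generate_test_cases(seeds, labels, predict_fn, grad_fn, eps=0.05):
    """predict_fn(x) -> class id; grad_fn(x, y) -> dLoss/dx with x's shape."""
    test_cases = []
    for x, y in zip(seeds, labels):
        x_adv = np.clip(x + eps * np.sign(grad_fn(x, y)), 0.0, 1.0)
        if predict_fn(x_adv) != y:            # behavior differs: useful test case
            test_cases.append((x_adv, y))
    return test_cases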
Software vulnerabilities often remain hidden until an attacker exploits the weak/insecure code. Therefore, testing the software from a vulnerability discovery perspective becomes challenging for developers if they do not inspect their code thoroughly (which is time-consuming). We propose that vulnerability prediction using certain software metrics can support the testing process by identifying vulnerable code components (e.g., functions, classes, etc.). Once a code component is predicted as vulnerable, the developers can focus their testing efforts on it, thereby avoiding the time/effort required to test the entire application. This paper presents a study that compares how software metrics perform as vulnerability predictors for software projects developed in two different languages (Java vs. Python). The goal of this research is to analyze the vulnerability prediction performance of software metrics for different programming languages. We designed and conducted experiments on security vulnerabilities reported for three Java projects (Apache Tomcat 6, Tomcat 7, Apache CXF) and two Python projects (Django and Keystone). In this paper, we focus on a specific type of code component: functions. We apply machine learning models to predict vulnerable functions. Overall, the results show that software metrics-based vulnerability prediction is more useful for Java projects than for Python projects (i.e., software metrics used as features were able to predict vulnerable Java functions with higher recall and precision than vulnerable Python functions).
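A minimal sketch of the prediction setup is given below, assuming function-level software metrics (e.g., cyclomatic complexity, lines of code) have already been extracted into a feature matrix; the classifier choice and split are illustrative, not the study's exact configuration.

# Minimal sketch: train a classifier on function-level software metrics and
# report precision/recall for the "vulnerable" class.
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

def evaluate_vulnerability_predictor(X, y):
    """X: metrics per function (n_functions x n_metrics); y: 1 = vulnerable."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                              stratify=y, random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    return {"precision": precision_score(y_te, pred),
            "recall": recall_score(y_te, pred)}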