Biblio
The cybersecurity community is gradually leveraging Machine Learning (ML) to combat ever-evolving threats. One of the biggest drivers for successful adoption of these models is how well domain experts and users are able to understand and trust their functionality. As these black-box models are employed to make important predictions, stakeholders increasingly demand transparency and explainability. Explanations supporting the output of ML models are crucial in cyber security, where experts require far more information from the model than a simple binary output for their analysis. Recent approaches in the literature have focused on three areas: (a) creating and improving explainability methods that help users better understand the internal workings of ML models and their outputs; (b) attacks on interpreters in white-box settings; and (c) defining the exact properties and metrics of the explanations generated by models. However, they have not covered the security properties and threat models relevant to the cybersecurity domain, or attacks on explainable models in black-box settings. In this paper, we bridge this gap by proposing a taxonomy for Explainable Artificial Intelligence (XAI) methods, covering various security properties and threat models relevant to the cybersecurity domain. We design a novel black-box attack for analyzing the consistency, correctness, and confidence security properties of gradient-based XAI methods. We validate our proposed system on three security-relevant datasets and models, and demonstrate that the method achieves the attacker's goal of misleading either both the classifier and the explanation report, or only the explainability method without affecting the classifier output. Our evaluation of the proposed approach shows promising results and can help in designing secure and robust XAI methods.
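As a minimal illustration of the class of explanation methods targeted above, the sketch below computes a gradient-based saliency map in PyTorch; the model, input shape, and target class are hypothetical, and the attack itself is not shown.

import torch

def saliency(model, x, target_class):
    # Return |d logit_target / d x| as a per-feature attribution map.
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)                      # shape: (1, num_classes)
    logits[0, target_class].backward()     # gradient of the target logit w.r.t. the input
    return x.grad.abs().squeeze(0)         # higher value = more influential feature

An attacker probing the consistency property would, for example, compare such maps for an input and its slightly perturbed copies.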
Research on the design of data center infrastructure is increasing, in both academia and industry, due to the rapid development of cloud-based applications such as search engines, social networks, and large-scale computing. At scale, data centers can consist of hundreds to thousands of servers that require high performance and low downtime. To meet the network's needs in a dynamic data center, where applications and services keep growing, the network topology must be designed to guarantee availability and security. One way to achieve this is by implementing the zero trust security model based on micro-segmentation. Zero trust is a security concept based on the principle of "never trust, always verify": no traffic is implicitly trusted, and all network traffic is treated as untrusted. Micro-segmentation is a way to achieve zero trust by dividing a network into smaller logical segments to restrict the traffic between them. In this research, data center network performance based on software-defined networking with a zero trust security model using micro-segmentation has been evaluated on a testbed simulation of Cisco Application Centric Infrastructure by measuring round trip time, jitter, and packet loss during experiments. Performance evaluation results show that micro-segmentation adds an average round trip time of 4 μs and jitter of 11 μs without packet loss, so security can be improved without significantly affecting network performance in the data center.
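For context on the reported metrics, the following sketch shows one simple way round trip time, jitter, and packet loss could be summarized from per-probe RTT samples; the sample values and the jitter definition (mean absolute difference of consecutive RTTs) are illustrative assumptions, not the paper's measurement setup.

from statistics import mean

def summarize(rtts_us):
    # rtts_us: per-probe round trip times in microseconds, None = lost probe
    received = [r for r in rtts_us if r is not None]
    loss = 1 - len(received) / len(rtts_us)
    jitter = mean(abs(a - b) for a, b in zip(received, received[1:])) if len(received) > 1 else 0.0
    return {"avg_rtt_us": mean(received), "jitter_us": jitter, "loss": loss}

# e.g. compare a baseline run against a micro-segmented run
print(summarize([120, 131, None, 125, 128]))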
To the best of our knowledge, the p-sensitive k-anonymity model is a sophisticated model for resisting linking attacks and homogeneity attacks in data publishing. However, if the distribution of sensitive values is skewed, the model struggles to defend against skewness attacks and may even face sensitivity attacks. In practice, the privacy requirements of different sensitive values are not always identical, and a "one size fits all" unified privacy protection level may cause unnecessary information loss. To address these problems, this paper quantifies privacy requirements with the concept of IDF and pays particular attention to sensitive groups. Two enhanced anonymity models with personalized protection, the (p,αisg)-sensitive k-anonymity model and the (pi,αisg)-sensitive k-anonymity model, are then proposed to resist skewness attacks and sensitivity attacks. Furthermore, two clustering algorithms with global search and local search are designed to implement the models. Experimental results show that the two enhanced models provide substantially better privacy at the expense of only a small loss in data utility.
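As a minimal sketch of the baseline property these enhanced models extend, the snippet below checks p-sensitive k-anonymity (each equivalence class has at least k records and at least p distinct sensitive values) on a toy generalized table; the attribute names and records are hypothetical.

from collections import defaultdict

def is_p_sensitive_k_anonymous(records, quasi_ids, sensitive, k, p):
    groups = defaultdict(list)
    for r in records:
        groups[tuple(r[a] for a in quasi_ids)].append(r[sensitive])
    return all(len(g) >= k and len(set(g)) >= p for g in groups.values())

table = [
    {"age": "3*", "zip": "130**", "disease": "flu"},
    {"age": "3*", "zip": "130**", "disease": "hiv"},
    {"age": "3*", "zip": "130**", "disease": "cancer"},
]
print(is_p_sensitive_k_anonymous(table, ["age", "zip"], "disease", k=3, p=2))  # True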
The existing anonymization-based differential privacy model adopts a unified anonymity method that ignores differences in personal privacy, which may lead to over- or under-protection of the original data [1]. Therefore, this paper proposes a personalized k-anonymity model for tuples (PKA) and a differential privacy data publishing algorithm (DPPA) based on personalized anonymity. First, based on the tuple personality factor set by the user in the original data set, the values are classified and the corresponding privacy protection relevance is calculated. Then, according to the tuple personality factor classification value, the data set is clustered with different degrees of anonymity, and the quasi-identifier attributes of each cluster are aggregated and noise is added to achieve anonymized differential privacy. Finally, the subsets are merged to obtain a data set that meets the release requirements. The correctness of the algorithm is analyzed theoretically, and the feasibility and effectiveness of the proposed algorithm are verified by comparison with similar algorithms.
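A minimal sketch of the noise-addition step described above, assuming a numeric quasi-identifier aggregated per cluster and Laplace noise calibrated to a chosen privacy budget; the cluster values, value range, and epsilon are illustrative, not the paper's parameters.

import numpy as np

def noisy_cluster_mean(values, epsilon, value_range):
    # Sensitivity of the mean over a bounded numeric domain of width value_range
    sensitivity = value_range / len(values)
    noise = np.random.laplace(scale=sensitivity / epsilon)
    return np.mean(values) + noise

ages = [34, 36, 39, 41]  # one cluster's quasi-identifier values (hypothetical)
print(noisy_cluster_mean(ages, epsilon=0.5, value_range=100))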
Deep Neural Networks (DNNs) are susceptible to model stealing attacks, which allow a data-limited adversary with no knowledge of the training dataset to clone the functionality of a target model using only black-box query access. Such attacks are typically carried out by querying the target model with inputs that are synthetically generated or sampled from a surrogate dataset to construct a labeled dataset. The adversary can use this labeled dataset to train a clone model that achieves a classification accuracy comparable to that of the target model. We propose "Adaptive Misinformation" to defend against such model stealing attacks. We identify that all existing model stealing attacks invariably query the target model with Out-Of-Distribution (OOD) inputs. By selectively sending incorrect predictions for OOD queries, our defense substantially degrades the accuracy of the attacker's clone model (by up to 40%), while minimally impacting the accuracy (< 0.5%) for benign users. Compared to existing defenses, our defense has a significantly better security vs. accuracy trade-off and incurs minimal computational overhead.
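A minimal sketch of the selective-misinformation idea, assuming a simple max-softmax-probability OOD score; the threshold and the way misleading probabilities are produced are illustrative stand-ins for the paper's detector and misinformation function.

import torch
import torch.nn.functional as F

def defended_predict(model, x, tau=0.9):
    with torch.no_grad():
        probs = F.softmax(model(x), dim=1)
    if probs.max().item() >= tau:          # confident -> treat as in-distribution, answer truthfully
        return probs
    misleading = 1.0 - probs               # low confidence -> return a misleading distribution
    return misleading / misleading.sum(dim=1, keepdim=True)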
Recognizing the need for proactive analysis of cyber adversary behavior, this paper presents a new event-driven simulation model and implementation to reveal the effort needed by attackers who have various entry points into a network. Unlike previous models, which focus on the impact of attackers' actions on the defender's infrastructure, this work focuses on the attackers' strategies and actions. By operating at a request-response session level, our model provides an abstraction of how the network infrastructure reacts to access credentials the adversary might have obtained through a variety of strategies. We present the current capabilities of the simulator by showing three variants of the Bronze Butler APT on a network with different user access levels.
Physical Unclonable Functions (PUFs) are a promising technology to secure low-cost devices. A PUF is a function whose values depend on the physical characteristics of the underlying hardware: the same PUF implemented on two identical integrated circuits will return different values. Thus, a PUF can be used as a unique fingerprint identifying one specific physical device among (apparently) identical copies that run the same firmware on the same hardware. PUFs, however, are tricky to implement, and a number of attacks have been reported in the literature, often due to wrong assumptions about the provided security guarantees and/or the attacker model. In this paper, we present the first mechanized symbolic model for PUFs that allows for precise reasoning about their security with respect to a varied set of attackers. We consider mutual authentication protocols based on different kinds of PUFs and model attackers that are able to access PUF values stored on servers, abuse the PUF APIs, model the PUF behavior, and exploit error correction data to reproduce the PUF values. We prove security properties and formally specify the capabilities required by the attacker to break them. Our analysis points out various subtleties and allows for a systematic comparison between different PUF-based protocols. The mechanized models are easily extensible and can be automatically checked with the Tamarin prover.
As the amount of data in various industries and government sectors grows exponentially, the '7V' concept of big data aims to create new value by broadly collecting and analyzing information from various fields. At the same time, as the ICT industry ecosystem matures, big data utilization is threatened by privacy attacks such as infringement due to the large amount of data. To manage and sustain a controllable privacy level, recommended de-identification techniques are needed. This paper examines these de-identification processes and three types of commonly used privacy models. Furthermore, it presents use cases in which these technologies can be adopted, as well as future development directions.
Based on an analysis of the difficulties and pain points of privacy protection in the opening and sharing of government data, this paper proposes a new method for the intelligent discovery and protection of structured and unstructured private data. Building on improvements to the existing government data masking process, the method introduces NLP and machine learning technologies and studies the intelligent discovery of sensitive data, the automatic recommendation of masking algorithms, and fully automatic execution following the improved masking process. In addition, dynamic and static masking prototypes with text and databases as data sources are designed and implemented with agent-based intelligent masking middleware. The results show that the recognition range and protection efficiency for government private data, especially unstructured government text, are significantly improved.
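As a minimal sketch of rule-based sensitive-data discovery and masking for unstructured text, the snippet below replaces matched entities with placeholder tags; the regular-expression patterns (e-mail, phone-like numbers) are illustrative stand-ins for the NLP/ML discovery models described in the paper.

import re

PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\b\d{3}[- ]?\d{3,4}[- ]?\d{4}\b"),
}

def mask_text(text):
    # Replace each discovered sensitive entity with a labeled placeholder
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label.upper()}>", text)
    return text

print(mask_text("Contact Jane at jane.doe@example.gov or 555-123-4567."))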