Biblio
Reliable operation of power systems is a primary challenge for system operators. With advances in technology and grid automation, power systems are becoming more vulnerable to cyber-attacks. The main goal of adversaries is to exploit these vulnerabilities and destabilize the system. This paper describes a game-theoretic approach to attacker/defender modeling in power systems. In our models, the attacker can strategically identify the subset of substations that maximizes damage when compromised, while the defender can identify the critical subset of substations to protect in order to minimize the damage when an attacker launches a cyber-attack. The algorithms for these models are applied to the standard IEEE 14-, 39-, and 57-bus test systems to identify the critical set of substations given an attacker and a defender budget.
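The budgeted max-min structure described in this abstract can be illustrated with a brute-force sketch; the substation names, per-substation damage values, and budgets below are invented for illustration and are not the paper's algorithm or data.

```python
# Hypothetical sketch of a budgeted attacker/defender subset game (not the paper's method).
# The attacker compromises up to k_a substations, the defender protects up to k_d;
# protected substations take no damage. We brute-force the min-max value on a toy example.
from itertools import combinations

def damage(compromised, protected, impact):
    """Total impact of compromised-but-unprotected substations (toy damage model)."""
    return sum(impact[s] for s in compromised if s not in protected)

def best_attack(protected, substations, impact, k_a):
    """Attacker's best response: the most damaging subset of size k_a."""
    return max(damage(set(a), protected, impact)
               for a in combinations(substations, k_a))

def best_defense(substations, impact, k_d, k_a):
    """Defender minimizes the attacker's best response (min-max over subsets)."""
    return min(
        (frozenset(d) for d in combinations(substations, k_d)),
        key=lambda d: best_attack(d, substations, impact, k_a),
    )

if __name__ == "__main__":
    impact = {"S1": 5.0, "S2": 3.0, "S3": 8.0, "S4": 2.0}   # assumed per-substation damage
    subs = sorted(impact)
    d = best_defense(subs, impact, k_d=1, k_a=2)
    print("defend:", set(d), "worst-case damage:", best_attack(d, subs, impact, 2))
```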
Machine learning (ML) algorithms provide good solutions for many security-sensitive applications, yet they themselves face the threat of adversarial attacks. A key problem in machine learning is therefore how to design feature selection algorithms that are robust against these attacks. Current research on defending against evasion attacks mainly focuses on the wrapped adversarial feature selection algorithm (WAFS), which depends on the underlying classification algorithm and whose time cost is very high for large-scale data. The mRMR (minimum Redundancy and Maximum Relevance) algorithm, by contrast, is one of the most popular filter algorithms for feature selection and does not involve any classifier during the selection process. In this paper, we propose a novel adversary-aware feature selection algorithm under the filter model based on mRMR, named FAFS. The algorithm, on the one hand, takes into account the correlation between each feature and the label as well as the redundancy between features; on the other hand, when selecting features, it considers not only the generalization ability in the absence of attack but also the robustness under attack. The performance of four algorithms, i.e., mRMR, TWFS (Traditional Wrapped Feature Selection), WAFS, and FAFS, is evaluated on spam filtering and PDF malware detection in the perfect-knowledge attack scenario. The experimental results show that FAFS performs better under evasion attacks with lower time complexity and comparable classification accuracy.
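As background for the filter model used above, a minimal sketch of plain mRMR-style selection (without FAFS's adversarial robustness term) might look as follows; the discrete toy data and feature counts are invented.

```python
# Minimal sketch of non-adversarial mRMR-style filter feature selection on discrete features.
# FAFS adds a robustness-under-attack term that is not reproduced here.
import numpy as np
from sklearn.metrics import mutual_info_score

def mrmr_select(X, y, k):
    """Greedily pick k features maximizing relevance minus mean redundancy."""
    n_features = X.shape[1]
    relevance = np.array([mutual_info_score(X[:, j], y) for j in range(n_features)])
    selected, remaining = [], list(range(n_features))
    while len(selected) < k and remaining:
        def score(j):
            if not selected:
                return relevance[j]
            redundancy = np.mean([mutual_info_score(X[:, j], X[:, s]) for s in selected])
            return relevance[j] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.integers(0, 3, size=(200, 10))          # toy discrete features
    y = (X[:, 0] + X[:, 3] > 2).astype(int)         # label depends on features 0 and 3
    print("selected features:", mrmr_select(X, y, k=3))
```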
A quantum information exchange computer emulator is presented, which takes into consideration imperfections of a real quantum channel, such as noise and attenuation, that make it necessary to increase the number of photons per pulse. The Qt Creator C++ program package provides an evaluation of the ability to detect unauthorized access as well as of the amount of information intercepted by an intruder.
Machine learning (ML) models are often trained, using large amounts of computing power, on private datasets that are very expensive to collect or highly sensitive. The models are commonly exposed either through online APIs or used in hardware devices deployed in the field or given to end users. This provides an incentive for adversaries to steal these ML models as a proxy for gathering the underlying datasets. While API-based model exfiltration has been studied before, the theft and protection of machine learning models on hardware devices have not yet been explored. In this work, we examine this important aspect of the design and deployment of ML models. We illustrate how an attacker may acquire either the model or the model architecture through memory probing, side channels, or crafted input attacks, and propose (1) power-efficient obfuscation as an alternative to encryption, and (2) timing side-channel countermeasures.
Along with the rapid development of hardware security techniques, the revolutionary growth of countermeasures and attack methods developed by intelligent and adaptive adversaries has significantly complicated the creation of secure hardware systems. There is thus a critical need to (re)evaluate existing and new hardware security techniques against these state-of-the-art attack methods. With this in mind, this paper presents a novel framework for incorporating active learning techniques into the hardware security field. We demonstrate that active learning can significantly improve the learning efficiency of physical unclonable function (PUF) modeling attacks by sampling the least confident, and hence most informative, challenge-response pair (CRP) for training in each iteration. For example, our experimental results show that in order to obtain a prediction error below 4%, 2790 CRPs are required with passive learning, while only 811 CRPs are required with active learning. The sampling strategies and detailed applications of PUF modeling attacks under various environmental conditions are also discussed. When the environment is very noisy, active learning may sample a large number of mislabeled CRPs and hence result in high prediction error. We present two methods to mitigate this tension between informative and noisy CRPs.
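A hedged sketch of the uncertainty-sampling idea is shown below against a simulated, noise-free arbiter PUF in the standard additive delay model; the stage count, pool size, and query budget are assumptions, not the paper's setup.

```python
# Uncertainty-sampling active learning against a simulated arbiter PUF (toy, noise-free).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_stages, pool_size = 32, 5000
w_true = rng.normal(size=n_stages + 1)                  # hidden delay parameters

def to_parity(challenges):
    """Map 0/1 challenges to the parity (Phi) feature vector of the arbiter PUF model."""
    phi = 1 - 2 * challenges                            # {0,1} -> {+1,-1}
    feats = np.cumprod(phi[:, ::-1], axis=1)[:, ::-1]   # suffix products
    return np.hstack([feats, np.ones((len(challenges), 1))])

X_pool = to_parity(rng.integers(0, 2, size=(pool_size, n_stages)))
y_pool = (X_pool @ w_true > 0).astype(int)              # responses without measurement noise

# Seed with a few labeled CRPs from each class, then query the least confident CRP each round.
labeled = list(np.where(y_pool == 0)[0][:10]) + list(np.where(y_pool == 1)[0][:10])
model = LogisticRegression(max_iter=1000)
for _ in range(30):
    model.fit(X_pool[labeled], y_pool[labeled])
    confidence = np.abs(model.predict_proba(X_pool)[:, 1] - 0.5)
    confidence[labeled] = np.inf                        # never re-query a labeled CRP
    labeled.append(int(np.argmin(confidence)))          # most informative CRP
print("prediction error: %.3f" % (1 - model.score(X_pool, y_pool)))
```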
We explore methods of producing adversarial examples on deep generative models such as the variational autoencoder (VAE) and the VAE-GAN. Deep learning architectures are known to be vulnerable to adversarial examples, but previous work has focused on the application of adversarial examples to classification tasks. Deep generative models have recently become popular due to their ability to model input data distributions and generate realistic examples from those distributions. We present three classes of attacks on the VAE and VAE-GAN architectures and demonstrate them against networks trained on MNIST, SVHN and CelebA. Our first attack leverages classification-based adversaries by attaching a classifier to the trained encoder of the target generative model, which can then be used to indirectly manipulate the latent representation. Our second attack directly uses the VAE loss function to generate a target reconstruction image from the adversarial example. Our third attack moves beyond relying on classification or the standard loss for the gradient and directly optimizes against differences in source and target latent representations. We also motivate why an attacker might be interested in deploying such techniques against a target generative network.
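The latent-matching idea behind the third attack class can be sketched as follows; the tiny stand-in encoder, image sizes, and optimization settings are assumptions for a self-contained example, not the paper's trained VAE or VAE-GAN.

```python
# Sketch of a latent-space attack: perturb a source input so its latent code
# approaches that of a target input, while keeping the perturbation small.
import torch
import torch.nn as nn

torch.manual_seed(0)
# Stand-in encoder; a real attack would use the target generative model's trained encoder.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 8))

x_source = torch.rand(1, 1, 28, 28)                 # image the adversary starts from
x_target = torch.rand(1, 1, 28, 28)                 # image whose latent code we want to match
z_target = encoder(x_target).detach()

delta = torch.zeros_like(x_source, requires_grad=True)
opt = torch.optim.Adam([delta], lr=0.05)
for step in range(200):
    opt.zero_grad()
    z_adv = encoder((x_source + delta).clamp(0, 1))
    # Match the target latent code; the l1 term keeps the perturbation small.
    loss = ((z_adv - z_target) ** 2).mean() + 0.01 * delta.abs().mean()
    loss.backward()
    opt.step()

final = ((encoder((x_source + delta).clamp(0, 1)) - z_target) ** 2).mean().item()
print(f"final latent distance: {final:.4f}")
```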
Adversaries are conducting attack campaigns with increasing levels of sophistication. Additionally, with the prevalence of out-of-the-box toolkits that simplify attack operations during different stages of an attack campaign, multiple new adversaries and attack groups have appeared over the past decade. Characterizing the behavior and the modus operandi of different adversaries is critical in identifying the appropriate security maneuver to detect and mitigate the impact of an ongoing attack. To this end, in this paper, we study two characteristics of an adversary: Risk-averseness and Experience level. Risk-averse adversaries are more cautious during their campaign while fledgling adversaries do not wait to develop adequate expertise and knowledge before launching attack campaigns. One manifestation of these characteristics is through the adversary's choice and usage of attack tools. To detect these characteristics, we present multi-level machine learning (ML) models that use network data generated while under attack by different attack tools and usage patterns. In particular, for risk-averseness, we considered different configurations for scanning tools and trained the models in a testbed environment. The resulting model was used to predict the cautiousness of different red teams that participated in the Cyber Shield ‘16 exercise. The predictions matched the expected behavior of the red teams. For Experience level, we considered publicly-available remote access tools and usage patterns. We developed a Markov model to simulate usage patterns of attackers with different levels of expertise and through experiments on CyberVAN, we showed that the ML model has a high accuracy.
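The usage-pattern simulation mentioned above can be illustrated with a simple Markov chain over tool-usage states; the states and transition probabilities below are made up for the sketch and are not taken from the paper or the CyberVAN experiments.

```python
# Simulating attacker tool-usage sequences with a Markov chain, one per experience level.
import numpy as np

states = ["recon", "exploit", "install_rat", "exfiltrate"]
# Rows: current state; columns: probability of the next state (assumed values, rows sum to 1).
transition = {
    "novice": np.array([[0.2, 0.6, 0.1, 0.1],
                        [0.1, 0.2, 0.5, 0.2],
                        [0.1, 0.1, 0.2, 0.6],
                        [0.4, 0.2, 0.2, 0.2]]),
    "expert": np.array([[0.5, 0.3, 0.1, 0.1],
                        [0.3, 0.3, 0.3, 0.1],
                        [0.2, 0.2, 0.4, 0.2],
                        [0.6, 0.2, 0.1, 0.1]]),
}

def simulate(level, steps, rng):
    """Generate one tool-usage trace for an attacker of the given experience level."""
    P = transition[level]
    s = 0
    trace = [states[s]]
    for _ in range(steps):
        s = rng.choice(len(states), p=P[s])
        trace.append(states[s])
    return trace

rng = np.random.default_rng(42)
print("novice:", simulate("novice", 6, rng))
print("expert:", simulate("expert", 6, rng))
```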
Motivated by the study of matrix elimination orderings in combinatorial scientific computing, we utilize graph sketching and local sampling to give a data structure that provides access to approximate fill degrees of a matrix undergoing elimination in polylogarithmic time per elimination and query. We then study the problem of using this data structure in the minimum degree algorithm, which is a widely-used heuristic for producing elimination orderings for sparse matrices by repeatedly eliminating the vertex with (approximate) minimum fill degree. This leads to a nearly-linear time algorithm for generating approximate greedy minimum degree orderings. Despite extensive studies of algorithms for elimination orderings in combinatorial scientific computing, our result is the first rigorous incorporation of randomized tools in this setting, as well as the first nearly-linear time algorithm for producing elimination orderings with provable approximation guarantees. While our sketching data structure readily works in the oblivious adversary model, by repeatedly querying and greedily updating itself, it enters the adaptive adversarial model where the underlying sketches become prone to failure due to dependency issues with their internal randomness. We show how to use an additional sampling procedure to circumvent this problem and to create an independent access sequence. Our technique for decorrelating interleaved queries and updates to this randomized data structure may be of independent interest.
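For context, the exact greedy heuristic that the paper approximates can be sketched as follows; the example graph is arbitrary and the sketch omits all of the paper's sketching and sampling machinery.

```python
# Classic (exact) greedy minimum degree ordering: repeatedly eliminate the vertex of
# minimum current degree and add the fill edges created by its elimination.
def minimum_degree_ordering(adj):
    """adj: dict vertex -> set of neighbours (symmetric). Returns an elimination order."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}   # work on a copy
    order = []
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))       # vertex of minimum (fill) degree
        nbrs = adj.pop(v)
        for u in nbrs:
            adj[u].discard(v)
        for u in nbrs:                                # eliminating v makes its neighbours a clique
            adj[u] |= nbrs - {u}
        order.append(v)
    return order

if __name__ == "__main__":
    graph = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3, 5}, 5: {4}}
    print(minimum_degree_ordering(graph))
```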
Establishing secret and reliable wireless communication is a challenging task of paramount importance. In this paper, we investigate the physical-layer security of the legitimate transmission link of a user who assists an Intrusion Detection System (IDS) in detecting eavesdropping and jamming attacks, in the presence of an adversary capable of conducting either type of attack. The user faces the dilemma of whether to transmit, and thus become vulnerable to an eavesdropping or a jamming attack, or to keep silent, in which case his/her transmission is delayed. The adversary, in turn, faces the dilemma of whether to conduct an eavesdropping or a jamming attack without being detected. We model the interactions between the user and the adversary as a two-state stochastic game. Explicit solutions characterize some properties of the game while highlighting interesting strategies embraced by the user and the adversary. Results show that our proposed system outperforms current systems in terms of communication secrecy.
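The paper analyzes a two-state stochastic game; as a much simpler hedged illustration, the sketch below solves only a one-shot zero-sum user-versus-adversary game with assumed payoffs, using the standard linear-programming formulation for mixed strategies.

```python
# Solving a one-shot zero-sum game (transmit/silent vs. eavesdrop/jam) by linear programming.
import numpy as np
from scipy.optimize import linprog

# Rows: user actions (transmit, stay silent); columns: adversary actions (eavesdrop, jam).
# Entries are the user's payoff (secrecy utility minus delay cost); values are illustrative.
A = np.array([[-1.0,  0.5],
              [ 0.2, -0.3]])
m, n = A.shape

# Variables: x (user's mixed strategy, length m) and v (game value).
# Maximize v subject to (A^T x)_j >= v for every adversary action j, sum(x) = 1, x >= 0.
c = np.concatenate([np.zeros(m), [-1.0]])             # linprog minimizes, so minimize -v
A_ub = np.hstack([-A.T, np.ones((n, 1))])             # v - (A^T x)_j <= 0
b_ub = np.zeros(n)
A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
b_eq = [1.0]
bounds = [(0, None)] * m + [(None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
x, v = res.x[:m], res.x[-1]
print("user mixed strategy:", np.round(x, 3), "game value:", round(v, 3))
```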
High-accuracy localization is a prerequisite for many wireless applications. To obtain accurate location information, users' positional knowledge often has to be shared, which brings the risk of leaking location information to adversaries during the localization process. This paper develops a theory and algorithms for protecting location secrecy. In particular, we first introduce a location secrecy metric (LSM) for a general measurement model of an eavesdropper. Compared to previous work, the measurement model accounts for parameters such as channel conditions and time offsets in addition to the positions of users. We determine the expression of the LSM for typical scenarios and show how the LSM depends on the capability of an eavesdropper and the quality of the eavesdropper's measurements. Based on the insights gained from the analysis, we consider a case study in a wireless localization network and develop an algorithm that diminishes the eavesdropper's capabilities by exploiting channel reciprocity. Numerical results show that the proposed algorithm can effectively increase the LSM and protect location secrecy.
In this paper, machine learning attacks are performed on a novel hybrid delay-based Arbiter Ring Oscillator PUF (AROPUF). The AROPUF exhibits improved results compared to the traditional Arbiter Physical Unclonable Function (APUF). The challenge-response pairs (CRPs) from both PUFs are fed to a multilayer perceptron (MLP) model with one hidden layer. The results show that the CRPs generated by the proposed AROPUF lead to higher training and prediction errors than those of the APUF, thus making it more difficult for an adversary to predict the CRPs.
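A minimal sketch of this style of modeling attack is given below, with a one-hidden-layer MLP trained on simulated CRPs of a plain arbiter PUF; the AROPUF itself is not modelled here and all sizes are assumptions.

```python
# One-hidden-layer MLP modeling attack on simulated arbiter-PUF CRPs (toy example).
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
n_stages, n_crps = 64, 8000
w = rng.normal(size=n_stages + 1)                       # hidden delay parameters

challenges = rng.integers(0, 2, size=(n_crps, n_stages))
phi = np.cumprod(1 - 2 * challenges[:, ::-1], axis=1)[:, ::-1]   # parity features
features = np.hstack([phi, np.ones((n_crps, 1))])
responses = (features @ w > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(features, responses, test_size=0.25, random_state=0)
mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
mlp.fit(X_tr, y_tr)
print("prediction error: %.3f" % (1 - mlp.score(X_te, y_te)))
```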
In this paper, we present an algorithm for estimating the state of the power grid following a cyber-physical attack. We assume that an adversary attacks an area by: (i) disconnecting some lines within that area (failed lines), and (ii) obstructing the information from within the area from reaching the control center. Given the phase angles of the buses outside the attacked area under the AC power flow model (before and after the attack), the algorithm estimates the phase angles of the buses and detects the failed lines inside the attacked area. The novelty of our approach is the transformation of the line failure detection problem, which is combinatorial in nature, into a convex optimization problem. As a result, our algorithm can detect any number of line failures in a running time that is independent of the number of failures and depends solely on the size of the network. To the best of our knowledge, this is the first convex relaxation for the problem of line failure detection using phase angle measurements under the AC power flow model. We evaluate the performance of our algorithm on the IEEE 118- and 300-bus systems, and show that it estimates the phase angles of the buses with less than 1% error, and can detect the line failures with 80% accuracy for single, double, and triple line failures.
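The general idea of relaxing a combinatorial failure-detection problem into a convex one can be illustrated with a generic l1-regularized (LASSO-style) toy, shown below; the random sensitivity matrix, noise level, and threshold are assumptions and this is not the paper's AC power-flow formulation.

```python
# Generic l1 relaxation for sparse failure detection: recover a sparse "failure" vector
# from linear measurements by solving a convex program instead of a combinatorial search.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(3)
n_meas, n_lines = 40, 60
A = rng.normal(size=(n_meas, n_lines))        # assumed sensitivity of measurements to line outages
x_true = np.zeros(n_lines)
x_true[[5, 17, 42]] = [1.2, -0.8, 1.5]        # three failed lines
y = A @ x_true + 0.01 * rng.normal(size=n_meas)

x = cp.Variable(n_lines)
problem = cp.Problem(cp.Minimize(cp.sum_squares(A @ x - y) + 0.1 * cp.norm1(x)))
problem.solve()
detected = np.where(np.abs(x.value) > 0.1)[0]
print("detected failed lines:", detected)
```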
The false data injection attack (FDIA) is a form of cyber-attack capable of affecting the secure and economic operation of the smart grid. Using DC model-based state estimation, this paper analyzes ways of constructing a successful attack vector to fulfill specific targets, i.e., a pre-specified state variable target or a pre-specified meter target, according to the adversary's intent. The grid operator's historical meter readings are considered as a constraint the adversary must respect to avoid being detected. Also from the viewpoint of the adversary, we propose to take full advantage of the dual concept of the coefficients in the topology matrix to handle the problem that the adversary has no access to some meters. The effectiveness of the proposed method is validated by numerical experiments on the IEEE-14 benchmark system.
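The basic reason such attacks evade residue-based bad-data detection under the DC model is the classic observation that an injection of the form a = Hc leaves the estimation residual unchanged. A minimal numerical illustration is sketched below; the matrices are random stand-ins rather than a real grid topology, and the paper's constrained attack construction is not reproduced.

```python
# DC-model FDIA illustration: an attack vector a = H c does not change the residual,
# so a residue-based bad-data detector cannot distinguish attacked from honest data.
import numpy as np

rng = np.random.default_rng(0)
n_meas, n_states = 20, 8
H = rng.normal(size=(n_meas, n_states))          # measurement (topology) matrix
x_true = rng.normal(size=n_states)
z = H @ x_true + 0.01 * rng.normal(size=n_meas)  # honest measurements

def residual(z, H):
    """Norm of the residual of a least-squares state estimate (unit measurement weights)."""
    x_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
    return np.linalg.norm(z - H @ x_hat)

c = np.zeros(n_states); c[2] = 0.5               # desired bias on state variable 2
a = H @ c                                        # attack vector consistent with the model
print("residual before attack: %.4f" % residual(z, H))
print("residual after attack:  %.4f" % residual(z + a, H))   # unchanged
```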
Machine learning and data mining algorithms typically assume that the training and testing data are sampled from the same fixed probability distribution; however, this assumption is often violated in practice. The field of domain adaptation addresses the situation where this assumption of a fixed distribution across the two domains is violated and the difference between the two domains (training/source and testing/target) may not be known a priori. There has been a recent thrust in addressing the problem of learning in the presence of an adversary, which we formulate as a problem of domain adaptation in order to build a more robust classifier. This is because the overall security of classifiers and their preprocessing stages has been called into question by recent findings about adversaries in a learning setting. Adversarial training (and testing) data pose a serious threat in scenarios where an attacker has the opportunity to "poison" the training data or "evade" on the testing data set(s) in order to achieve something that is not in the best interest of the classifier. Recent work has begun to show the impact of adversarial data on several classifiers; however, the impact of the adversary on aspects related to preprocessing of data (i.e., dimensionality reduction or feature selection) has been widely ignored in the revamp of adversarial learning research. Furthermore, variable selection, which is a vital component of any data analysis, has been shown to be particularly susceptible to an attacker that has knowledge of the task. In this work, we explore avenues for learning resilient classification models in the adversarial learning setting by considering the effects of adversarial data and how to mitigate them through optimization. Our model forms a single convex optimization problem that uses the labeled training data from the source domain and known weaknesses of the model for an adversarial component. We benchmark the proposed approach on synthetic data and show the trade-off between classification accuracy and skew-insensitive statistics.
Deregulated electricity markets rely on a two-settlement system consisting of day-ahead and real-time markets, across which electricity price is volatile. In such markets, locational marginal pricing is widely adopted to set electricity prices and manage transmission congestion. Locational marginal prices are vulnerable to measurement errors. Existing studies show that if the adversaries are omniscient, they can design profitable attack strategies without being detected by the residue-based bad data detectors. This paper focuses on a more realistic setting, in which the attackers have only partial and imperfect information due to their limited resources and restricted physical access to the grid. Specifically, the attackers are assumed to have uncertainties about the state of the grid, and the uncertainties are modeled stochastically. Based on this model, this paper offers a framework for characterizing the optimal stochastic guarantees for the effectiveness of the attacks and the associated pricing impacts.
Cooperative spectrum sensing is often necessary in cognitive radio systems to localize a transmitter by fusing the measurements from multiple sensing radios. However, revealing spectrum sensing information also generally leaks information about the location of the radio that made those measurements. We propose a protocol for performing cooperative spectrum sensing while preserving the privacy of the sensing radios. In this protocol, radios fuse sensing information through a distributed particle filter based on a tree structure. All sensing information is encrypted using public-key cryptography, and one of the radios serves as an anonymizer, whose role is to break the connection between the sensing radios and the public keys they use. We consider a semi-honest (honest-but-curious) adversary model in which there is at most a single adversary that is internal to the sensing network and complies with the specified protocol but wishes to determine information about the other participants. Under this scenario, an adversary may learn the sensing information of some of the radios, but it has no way to tie that information to a particular radio's identity. We test the performance of our proposed distributed, tree-based particle filter using physical measurements of FM broadcast stations.
This paper offers a new approach to modelling the effect of cyber-attacks on reliability of software used in industrial control applications. The model is based on the view that successful cyber-attacks introduce failure regions, which are not present in non-compromised software. The model is then extended to cover a fault tolerant architecture, such as the 1-out-of-2 software, popular for building industrial protection systems. The model is used to study the effectiveness of software maintenance policies such as patching and "cleansing" ("proactive recovery") under different adversary models ranging from independent attacks to sophisticated synchronized attacks on the channels. We demonstrate that the effect of attacks on reliability of diverse software significantly depends on the adversary model. Under synchronized attacks system reliability may be more than an order of magnitude worse than under independent attacks on the channels. These findings, although not surprising, highlight the importance of using an adequate adversary model in the assessment of how effective various cyber-security controls are.
Distributed Denial of Service (DDoS) attacks are some of the most persistent threats on the Internet today. The evolution of DDoS attacks calls for an in-depth analysis of those attacks. A better understanding of the attackers' behavior can provide insights to unveil patterns and strategies utilized by attackers. The prior art on attacker behavior analysis often falls short in two respects: it assumes that adversaries are static, and it makes certain simplifying assumptions about their behavior that are often not supported by real attack data. In this paper, we take a data-driven approach to designing and validating three DDoS attack models from temporal (e.g., attack magnitudes), spatial (e.g., attacker origin), and spatiotemporal (e.g., attack inter-launching time) perspectives. We design these models based on the analysis of traces consisting of more than 50,000 verified DDoS attacks from industrial mitigation operations. Each model is also validated by testing its effectiveness in accurately predicting future DDoS attacks. Comparisons against simple intuitive models further show that our models can more accurately capture the essential features of DDoS attacks.
Machine learning is widely used in security-sensitive settings like spam and malware detection, although it has been shown that malicious data can be carefully modified at test time to evade detection. To overcome this limitation, adversary-aware learning algorithms have been developed, exploiting robust optimization and game-theoretical models to incorporate knowledge of potential adversarial data manipulations into the learning algorithm. Although these techniques have been shown to be effective in some adversarial learning tasks, their adoption in practice is hindered by several factors, including the difficulty of meeting specific theoretical requirements, the complexity of implementation, and scalability issues in terms of the computational time and space required during training. In this work, we aim to develop secure kernel machines against evasion attacks that are not computationally more demanding than their non-secure counterparts. In particular, leveraging recent work on robustness and regularization, we show that the security of a linear classifier can be drastically improved by selecting a proper regularizer, depending on the kind of evasion attack, as well as by unbalancing the cost of classification errors. We then discuss the security of nonlinear kernel machines, and show that a proper choice of the kernel function is crucial. We also show that unbalancing the cost of classification errors and varying some kernel parameters can further improve classifier security, yielding decision functions that better enclose the legitimate data. Our results on spam and PDF malware detection corroborate our analysis.
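The kind of comparison discussed above (regularizer choice plus unbalanced error costs for a linear classifier) can be sketched as follows; the synthetic data, cost ratios, and the use of LinearSVC are assumptions for illustration, not the paper's experimental setup.

```python
# Comparing a dense (l2) vs. sparse (l1) regularizer and unbalanced class costs
# for a linear classifier on synthetic data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=2000, n_features=50, n_informative=10, random_state=0)

configs = {
    "l2, balanced costs":  LinearSVC(penalty="l2", C=1.0),
    "l1, balanced costs":  LinearSVC(penalty="l1", dual=False, C=1.0),
    "l2, malicious-heavy": LinearSVC(penalty="l2", C=1.0, class_weight={0: 1.0, 1: 5.0}),
}
for name, clf in configs.items():
    clf.fit(X, y)
    w = clf.coef_.ravel()
    # A more evenly spread weight vector (smaller max |w_i|) is harder to evade by
    # changing only a few features, which relates to the robustness intuition above.
    print(f"{name}: accuracy={clf.score(X, y):.3f}, max|w|={np.abs(w).max():.3f}")
```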
The secure two-party computation (S2PC) protocols SHADE and GSHADE were introduced by Bringer et al. over the last two years. GSHADE makes it possible to compute different distances (Hamming, Euclidean, Mahalanobis) quite efficiently and is among the most efficient of the available S2PC methods. This protocol can thus be used to efficiently compute one-to-many identification for several types of biometric data (iris, face, fingerprint). In this paper, we introduce two extensions of GSHADE. The first enables us to evaluate new multiplicative functions; in this way, we show how to apply GSHADE to a classical machine learning algorithm. The second is a new proposal to secure GSHADE against malicious adversaries following the recent dual execution and cut-and-choose strategies, at very small additional cost. By preserving GSHADE's structure, our extensions remain very efficient compared to other S2PC methods.
Recently, various protocols have been proposed for securely outsourcing database storage to a third-party server, ranging from systems with "full-fledged" security based on strong cryptographic primitives such as fully homomorphic encryption or oblivious RAM, to more practical implementations based on searchable symmetric encryption or even on deterministic and order-preserving encryption. On the flip side, various attacks have emerged showing that for some of these protocols the confidentiality of the data can be compromised, usually given certain auxiliary information. We take a step back and identify a need for a formal understanding of the inherent efficiency/privacy trade-off in outsourced database systems, independent of the details of the system. We propose abstract models that capture secure outsourced storage systems in sufficient generality, and identify two basic sources of leakage, namely access pattern and communication volume. We use our models to distinguish certain classes of outsourced database systems that have been proposed, and deduce that all of them exhibit at least one of these leakage sources. We then develop generic reconstruction attacks on any system supporting range queries where either access pattern or communication volume is leaked. These attacks are in a rather weak passive adversarial model, where the untrusted server knows only the underlying query distribution. In particular, to perform our attack the server need not have any prior knowledge about the data, and need not know any of the issued queries nor their results. Yet, the server can reconstruct the secret attribute of every record in the database after about N^4 queries, where N is the domain size. We provide a matching lower bound showing that our attacks are essentially optimal. Our reconstruction attacks using communication volume apply even to systems based on homomorphic encryption or oblivious RAM used in the natural way. Finally, we provide experimental results demonstrating the efficacy of our attacks on real datasets with a variety of different features. On all these datasets, after the required number of queries, our attacks successfully recovered the secret attributes of every record in at most a few seconds.