Biblio

Filters: Keyword is Adversarial Machine Learning
2020-12-28
Barni, M., Nowroozi, E., Tondi, B., Zhang, B.  2020.  Effectiveness of Random Deep Feature Selection for Securing Image Manipulation Detectors Against Adversarial Examples. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :2977–2981.

We investigate whether the random feature selection approach proposed in [1] to improve the robustness of forensic detectors against targeted attacks can be extended to detectors based on deep learning features. In particular, we study the transferability of adversarial examples targeting an original CNN image manipulation detector to other detectors (a fully connected neural network and a linear SVM) that rely on a random subset of the features extracted from the flatten layer of the original network. The results we obtained by considering three image manipulation detection tasks (resizing, median filtering and adaptive histogram equalization), two original network architectures and three classes of attacks show that feature randomization helps to hinder attack transferability, even if, in some cases, simply changing the architecture of the detector, or retraining it, is enough to prevent transferability of the attacks.
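
A minimal Python sketch of the feature-randomization idea, assuming the deep features from the flatten layer and the labels have already been extracted; this is an illustration, not the authors' implementation:

import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(seed=0)

def train_randomized_detector(deep_features, labels, keep_ratio=0.3):
    """Fit a secondary detector (linear SVM) on a random subset of deep features."""
    n_features = deep_features.shape[1]
    selected = rng.choice(n_features, size=int(keep_ratio * n_features), replace=False)
    svm = LinearSVC(C=1.0, max_iter=10000)
    svm.fit(deep_features[:, selected], labels)
    return svm, selected  # the secret index set is part of the defense

# An attacker who optimized an adversarial example against the full feature
# vector does not know `selected`, which is what hinders transferability.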

Raju, R. S., Lipasti, M.  2020.  BlurNet: Defense by Filtering the Feature Maps. 2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :38–46.

Recently, the field of adversarial machine learning has been garnering attention by showing that state-of-the-art deep neural networks are vulnerable to adversarial examples, stemming from small perturbations being added to the input image. Adversarial examples are generated by a malicious adversary by obtaining access to the model parameters, such as gradient information, to alter the input or by attacking a substitute model and transferring those malicious examples over to attack the victim model. Specifically, one of these attack algorithms, Robust Physical Perturbations (RP2), generates adversarial images of stop signs with black and white stickers to achieve high targeted misclassification rates against standard-architecture traffic sign classifiers. In this paper, we propose BlurNet, a defense against the RP2 attack. First, we motivate the defense with a frequency analysis of the first-layer feature maps of the network on the LISA dataset, which shows that high-frequency noise is introduced into the input image by the RP2 algorithm. To remove the high-frequency noise, we introduce a depthwise convolution layer of standard blur kernels after the first layer. We perform a black-box transfer attack to show that low-pass filtering the feature maps is more beneficial than filtering the input. We then present various regularization schemes to incorporate this low-pass filtering behavior into the training regime of the network and perform white-box attacks. We conclude with an adaptive attack evaluation to show that the success rate of the attack drops from 90% to 20% with total variation regularization, one of the proposed defenses.
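
The filtering layer described above is easy to picture as a fixed depthwise convolution; the following PyTorch sketch (an illustration with assumed shapes, not the BlurNet code) places a non-trainable box-blur kernel over each feature map:

import torch
import torch.nn as nn

class DepthwiseBlur(nn.Module):
    """Fixed low-pass (box blur) filter applied channel-wise to feature maps."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.blur = nn.Conv2d(channels, channels, kernel_size,
                              padding=kernel_size // 2, groups=channels, bias=False)
        weight = torch.full((channels, 1, kernel_size, kernel_size),
                            1.0 / (kernel_size * kernel_size))
        self.blur.weight = nn.Parameter(weight, requires_grad=False)  # not trained

    def forward(self, x):
        return self.blur(x)

# Example: smooth the 64 feature maps produced by a hypothetical first conv layer.
smoothed = DepthwiseBlur(64)(torch.randn(1, 64, 32, 32))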

2020-12-01
Usama, M., Asim, M., Latif, S., Qadir, J., Al-Fuqaha, A.  2019.  Generative Adversarial Networks For Launching and Thwarting Adversarial Attacks on Network Intrusion Detection Systems. 2019 15th International Wireless Communications Mobile Computing Conference (IWCMC). :78–83.

Intrusion detection systems (IDSs) are an essential cog of the network security suite that can defend the network from malicious intrusions and anomalous traffic. Many machine learning (ML)-based IDSs have been proposed in the literature for the detection of malicious network traffic. However, recent works have shown that ML models are vulnerable to adversarial perturbations, through which an adversary can cause IDSs to malfunction by introducing a small, imperceptible perturbation in the network traffic. In this paper, we propose an adversarial ML attack using generative adversarial networks (GANs) that can successfully evade an ML-based IDS. We also show that GANs can be used to inoculate the IDS and make it more robust to adversarial perturbations.
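
A toy PyTorch sketch of the attack side of this idea, under stated assumptions (a 41-dimensional flow-feature vector, a frozen surrogate IDS, and a small bounded perturbation); it is not the paper's architecture:

import torch
import torch.nn as nn

FEATS = 41  # assumed flow-feature dimensionality (e.g., NSL-KDD-style features)

generator = nn.Sequential(nn.Linear(FEATS + 8, 64), nn.ReLU(),
                          nn.Linear(64, FEATS), nn.Tanh())
surrogate_ids = nn.Sequential(nn.Linear(FEATS, 32), nn.ReLU(), nn.Linear(32, 1))

opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def attack_step(malicious_batch):
    """One GAN-style update: perturb malicious flows so the surrogate IDS calls them benign."""
    noise = torch.randn(malicious_batch.size(0), 8)
    delta = 0.1 * generator(torch.cat([malicious_batch, noise], dim=1))  # small perturbation
    adv = malicious_batch + delta
    loss = bce(surrogate_ids(adv), torch.zeros(adv.size(0), 1))  # target label: benign (0)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return adv.detach()

As the abstract notes, the same generated samples could in turn be fed back into IDS training to make the detector more robust.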

2020-11-04
Khalid, F., Hanif, M. A., Rehman, S., Ahmed, R., Shafique, M.  2019.  TrISec: Training Data-Unaware Imperceptible Security Attacks on Deep Neural Networks. 2019 IEEE 25th International Symposium on On-Line Testing and Robust System Design (IOLTS). :188–193.

Most of the data manipulation attacks on deep neural networks (DNNs) during the training stage introduce a perceptible noise that can be handled by preprocessing during inference, or can be identified during the validation phase. Therefore, data poisoning attacks during inference (e.g., adversarial attacks) are becoming more popular. However, many of them do not consider the imperceptibility factor in their optimization algorithms, and can be detected by correlation and structural similarity analysis, or are noticeable (e.g., by humans) in a multi-level security system. Moreover, the majority of inference attacks rely on some knowledge about the training dataset. In this paper, we propose a novel methodology which automatically generates imperceptible attack images by using the back-propagation algorithm on pre-trained DNNs, without requiring any information about the training dataset (i.e., completely training data-unaware). We present a case study on traffic sign detection using the VGGNet trained on the German Traffic Sign Recognition Benchmark dataset in an autonomous driving use case. Our results demonstrate that the generated attack images successfully perform misclassification while remaining imperceptible in both “subjective” and “objective” quality tests.
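
A rough sketch of generating a perturbation while monitoring an objective imperceptibility metric (SSIM here), loosely in the spirit of the abstract; the model, the single-image tensors, and the thresholds are all assumptions, and this is not the TrISec algorithm itself:

import torch
import torch.nn.functional as F
from skimage.metrics import structural_similarity

def imperceptible_attack(model, image, true_label, step=1e-3, ssim_floor=0.99, max_iters=200):
    """image: CHW float tensor in [0, 1] with no gradient; true_label: 0-dim long tensor."""
    adv = image.clone()
    for _ in range(max_iters):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv.unsqueeze(0)), true_label.unsqueeze(0))
        grad, = torch.autograd.grad(loss, adv)
        candidate = (adv + step * grad.sign()).clamp(0, 1).detach()
        score = structural_similarity(image.permute(1, 2, 0).numpy(),
                                      candidate.permute(1, 2, 0).numpy(),
                                      channel_axis=2, data_range=1.0)
        if score < ssim_floor:
            break  # next step would become objectively perceptible; stop
        adv = candidate
        if model(adv.unsqueeze(0)).argmax(dim=1).item() != true_label.item():
            break  # misclassification achieved while still imperceptible
    return adv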

2020-09-04
Usama, Muhammad, Qayyum, Adnan, Qadir, Junaid, Al-Fuqaha, Ala.  2019.  Black-box Adversarial Machine Learning Attack on Network Traffic Classification. 2019 15th International Wireless Communications Mobile Computing Conference (IWCMC). :84–89.

Deep machine learning techniques have shown promising results in network traffic classification; however, the robustness of these techniques under adversarial threats is still in question. Deep machine learning models have been found vulnerable to small, carefully crafted adversarial perturbations, posing a major question about the performance of deep machine learning techniques. In this paper, we propose a black-box adversarial attack on network traffic classification. The proposed attack successfully evades deep machine learning-based classifiers, which highlights the potential security threat of using deep machine learning techniques to realize autonomous networks.

2020-08-03
Juuti, Mika, Szyller, Sebastian, Marchal, Samuel, Asokan, N..  2019.  PRADA: Protecting Against DNN Model Stealing Attacks. 2019 IEEE European Symposium on Security and Privacy (EuroS P). :512–527.
Machine learning (ML) applications are increasingly prevalent. Protecting the confidentiality of ML models becomes paramount for two reasons: (a) a model can be a business advantage to its owner, and (b) an adversary may use a stolen model to find transferable adversarial examples that can evade classification by the original model. Access to the model can be restricted to be only via well-defined prediction APIs. Nevertheless, prediction APIs still provide enough information to allow an adversary to mount model extraction attacks by sending repeated queries via the prediction API. In this paper, we describe new model extraction attacks using novel approaches for generating synthetic queries, and optimizing training hyperparameters. Our attacks outperform state-of-the-art model extraction in terms of transferability of both targeted and non-targeted adversarial examples (up to +29-44 percentage points, pp), and prediction accuracy (up to +46 pp) on two datasets. We provide take-aways on how to perform effective model extraction attacks. We then propose PRADA, the first step towards generic and effective detection of DNN model extraction attacks. It analyzes the distribution of consecutive API queries and raises an alarm when this distribution deviates from benign behavior. We show that PRADA can detect all prior model extraction attacks with no false positives.
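
A toy monitor in the spirit of the detection idea (track the minimum distance of each new query to earlier ones and flag a client whose distance distribution stops looking benign); the window size, threshold, and normality test are assumptions, not PRADA's exact procedure:

import numpy as np
from scipy.stats import shapiro

class QueryMonitor:
    def __init__(self, alarm_p=0.05, min_samples=30):
        self.history = []      # past query feature vectors for one client
        self.min_dists = []
        self.alarm_p = alarm_p
        self.min_samples = min_samples

    def observe(self, query):
        """Return True when the recent minimum-distance distribution deviates from normal."""
        query = np.asarray(query, dtype=float)
        if self.history:
            self.min_dists.append(min(np.linalg.norm(query - past) for past in self.history))
        self.history.append(query)
        if len(self.min_dists) >= self.min_samples:
            _, p_value = shapiro(self.min_dists[-self.min_samples:])
            return p_value < self.alarm_p
        return False
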
2020-07-13
Mahmood, Shah.  2019.  The Anti-Data-Mining (ADM) Framework - Better Privacy on Online Social Networks and Beyond. 2019 IEEE International Conference on Big Data (Big Data). :5780–5788.
The unprecedented and enormous growth of cloud computing, especially online social networks, has resulted in numerous incidents of the loss of users' privacy. In this paper, we provide a framework, based on our anti-data-mining (ADM) principle, to enhance users' privacy against adversaries including: online social networks; search engines; financial terminal providers; ad networks; eavesdropping governments; and other parties who can monitor users' content from the point where the content leaves users' computers to within the data centers of these information accumulators. To achieve this goal, our framework proactively uses the principles of suppression of sensitive data and disinformation. Moreover, we use social-bots in a novel way for enhanced privacy and provide users with plausible deniability for their photos, audio, and video content uploaded online.
2020-07-03
Usama, Muhammad, Asim, Muhammad, Qadir, Junaid, Al-Fuqaha, Ala, Imran, Muhammad Ali.  2019.  Adversarial Machine Learning Attack on Modulation Classification. 2019 UK/China Emerging Technologies (UCET). :1–4.

Modulation classification is an important component of cognitive self-driving networks. Recently many ML-based modulation classification methods have been proposed. We have evaluated the robustness of 9 ML-based modulation classifiers against the powerful Carlini & Wagner (C-W) attack and showed that the current ML-based modulation classifiers do not provide any deterrence against adversarial ML examples. To the best of our knowledge, we are the first to report the results of the application of the C-W attack for creating adversarial examples against various ML models for modulation classification.

Adari, Suman Kalyan, Garcia, Washington, Butler, Kevin.  2019.  Adversarial Video Captioning. 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :24–27.
In recent years, developments in the field of computer vision have allowed deep learning-based techniques to surpass human-level performance. However, these advances have also culminated in the advent of adversarial machine learning techniques, capable of launching targeted image captioning attacks that easily fool deep learning models. Although attacks in the image domain are well studied, little work has been done in the video domain. In this paper, we show it is possible to extend prior attacks in the image domain to the video captioning task, without heavily affecting the video's playback quality. We demonstrate our attack against a state-of-the-art video captioning model, by extending a prior image captioning attack known as Show and Fool. To the best of our knowledge, this is the first successful method for targeted attacks against a video captioning model, which is able to inject 'subliminal' perturbations into the video stream, and force the model to output a chosen caption with up to 0.981 cosine similarity, achieving near-perfect similarity to chosen target captions.
2020-02-18
Huang, Yonghong, Verma, Utkarsh, Fralick, Celeste, Infantes-Lopez, Gabriel, Kumar, Brajesh, Woodward, Carl.  2019.  Malware Evasion Attack and Defense. 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :34–38.

Machine learning (ML) classifiers are vulnerable to adversarial examples. An adversarial example is an input sample which is slightly modified to induce misclassification in an ML classifier. In this work, we investigate white-box and grey-box evasion attacks against an ML-based malware detector and conduct performance evaluations in a real-world setting. We compare the defense approaches in mitigating the attacks. We propose a framework for deploying grey-box and black-box attacks against malware detection systems.

2020-01-28
Pajola, Luca, Pasa, Luca, Conti, Mauro.  2019.  Threat Is in the Air: Machine Learning for Wireless Network Applications. Proceedings of the ACM Workshop on Wireless Security and Machine Learning. :16–21.

With the spread of wireless applications, a huge amount of data is generated every day. Thanks to its flexibility, machine learning is becoming a fundamental building block in this field, and many applications are developed using it and the several techniques it offers. However, machine learning suffers from various problems, and the people who use it are often unaware of the possible threats. Often, an adversary tries to exploit these vulnerabilities in order to obtain benefits; because of this, adversarial machine learning is becoming widely studied in the scientific community. In this paper, we present state-of-the-art adversarial techniques and possible countermeasures, with the aim of warning people about sensitive issues related to machine learning.

Zizzo, Giulio, Hankin, Chris, Maffeis, Sergio, Jones, Kevin.  2019.  Adversarial Machine Learning Beyond the Image Domain. Proceedings of the 56th Annual Design Automation Conference 2019. :1–4.
Machine learning systems have had enormous success in a wide range of fields, from computer vision and natural language processing to anomaly detection. However, such systems are vulnerable to attackers who can cause deliberate misclassification by introducing small perturbations. With machine learning systems being proposed for cyber attack detection, such attackers are cause for serious concern. Despite this, the vast majority of adversarial machine learning security research is focused on the image domain. This work gives a brief overview of adversarial machine learning and of machine learning used in cyber attack detection, and suggests key differences between the traditional image domain of adversarial machine learning and the cyber domain. Finally, we show an adversarial machine learning attack on an industrial control system.
2019-02-22
McKnight, Christopher, Goldberg, Ian.  2018.  Style Counsel: Seeing the (Random) Forest for the Trees in Adversarial Code Stylometry. Proceedings of the 2018 Workshop on Privacy in the Electronic Society. :138–142.

The results of recent experiments have suggested that code stylometry can successfully identify the author of short programs from among hundreds of candidates with up to 98% precision. This potential ability to discern the programmer of a code sample from a large group of possible authors could have concerning consequences for the open-source community at large, particularly those contributors that may wish to remain anonymous. Recent international events have suggested the developers of certain anti-censorship and anti-surveillance tools are being targeted by their governments and forced to delete their repositories or face prosecution. In light of this threat to the freedom and privacy of individual programmers around the world, we devised a tool, Style Counsel, to aid programmers in obfuscating their inherent style and imitating another, overt, author's style in order to protect their anonymity from this forensic technique. Our system utilizes the implicit rules encoded in the decision points of a random forest ensemble in order to derive a set of recommendations to present to the user detailing how to achieve this obfuscation and mimicry attack.
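
As a sketch of how the decision points of a trained random forest can be surfaced as raw material for style-change recommendations (an illustration using scikit-learn internals, not the Style Counsel tool):

from collections import Counter
from sklearn.ensemble import RandomForestClassifier

def common_decision_points(forest: RandomForestClassifier, feature_names, top=10):
    """List the (stylometric feature, threshold) pairs a fitted forest tests most often."""
    hints = Counter()
    for tree in forest.estimators_:
        t = tree.tree_
        for node in range(t.node_count):
            if t.children_left[node] != t.children_right[node]:  # internal split node
                hints[(feature_names[t.feature[node]], round(float(t.threshold[node]), 3))] += 1
    return hints.most_common(top)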

2019-02-08
Fang, Minghong, Yang, Guolei, Gong, Neil Zhenqiang, Liu, Jia.  2018.  Poisoning Attacks to Graph-Based Recommender Systems. Proceedings of the 34th Annual Computer Security Applications Conference. :381–392.

Recommender systems are an important component of many web services that help users locate items matching their interests. Several studies have shown that recommender systems are vulnerable to poisoning attacks, in which an attacker injects fake data into a recommender system such that the system makes recommendations as the attacker desires. However, these poisoning attacks are either agnostic to recommendation algorithms or optimized for recommender systems (e.g., association-rule-based or matrix-factorization-based recommender systems) that are not graph-based. Like association-rule-based and matrix-factorization-based recommender systems, graph-based recommender systems are also deployed in practice, e.g., by eBay and the Huawei App Store (a big app store in China). However, how to design optimized poisoning attacks for graph-based recommender systems is still an open problem. In this work, we perform a systematic study on poisoning attacks to graph-based recommender systems. We consider an attacker whose goal is to promote a target item so that it is recommended to as many users as possible. To achieve this goal, our attacks inject fake users with carefully crafted rating scores into the recommender system. Due to limited resources and to avoid detection, we assume the number of fake users that can be injected into the system is bounded. The key challenge is how to assign rating scores to the fake users such that the target item is recommended to as many normal users as possible. To address the challenge, we formulate the poisoning attacks as an optimization problem, solving which determines the rating scores for the fake users. We also propose techniques to solve the optimization problem. We evaluate our attacks and compare them with existing attacks under white-box (recommendation algorithm and its parameters are known), gray-box (recommendation algorithm is known but its parameters are unknown), and black-box (recommendation algorithm is unknown) settings using two real-world datasets. Our results show that our attack is effective and outperforms existing attacks for graph-based recommender systems. For instance, when injected fake users account for 1% of all users, our attack can make a target item recommended to 580 times more normal users in certain scenarios.

Zügner, Daniel, Akbarnejad, Amir, Günnemann, Stephan.  2018.  Adversarial Attacks on Neural Networks for Graph Data. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. :2847–2856.
Deep learning models for graphs have achieved strong performance for the task of node classification. Despite their proliferation, currently there is no study of their robustness to adversarial attacks. Yet, in domains where they are likely to be used, e.g. the web, adversaries are common. Can deep learning models for graphs be easily fooled? In this work, we introduce the first study of adversarial attacks on attributed graphs, specifically focusing on models exploiting ideas of graph convolutions. In addition to attacks at test time, we tackle the more challenging class of poisoning/causative attacks, which focus on the training phase of a machine learning model. We generate adversarial perturbations targeting the node's features and the graph structure, thus taking the dependencies between instances into account. Moreover, we ensure that the perturbations remain unnoticeable by preserving important data characteristics. To cope with the underlying discrete domain we propose an efficient algorithm, Nettack, exploiting incremental computations. Our experimental study shows that the accuracy of node classification significantly drops even when performing only a few perturbations. Even more, our attacks are transferable: the learned attacks generalize to other state-of-the-art node classification models and unsupervised approaches, and likewise are successful even when only limited knowledge about the graph is given.
Das, Nilaksh, Shanbhogue, Madhuri, Chen, Shang-Tse, Hohman, Fred, Li, Siwei, Chen, Li, Kounavis, Michael E., Chau, Duen Horng.  2018.  SHIELD: Fast, Practical Defense and Vaccination for Deep Learning Using JPEG Compression. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. :196–204.

The rapidly growing body of research in adversarial machine learning has demonstrated that deep neural networks (DNNs) are highly vulnerable to adversarially generated images. This underscores the urgent need for practical defense techniques that can be readily deployed to combat attacks in real-time. Observing that many attack strategies aim to perturb image pixels in ways that are visually imperceptible, we place JPEG compression at the core of our proposed SHIELD defense framework, utilizing its capability to effectively "compress away" such pixel manipulation. To immunize a DNN model from artifacts introduced by compression, SHIELD "vaccinates" the model by retraining it with compressed images, where different compression levels are applied to generate multiple vaccinated models that are ultimately used together in an ensemble defense. On top of that, SHIELD adds an additional layer of protection by employing randomization at test time that compresses different regions of an image using random compression levels, making it harder for an adversary to estimate the transformation performed. This novel combination of vaccination, ensembling, and randomization makes SHIELD a fortified multi-pronged defense. We conducted extensive, large-scale experiments using the ImageNet dataset, and show that our approaches eliminate up to 98% of gray-box attacks delivered by strong adversarial techniques such as Carlini-Wagner's L2 attack and DeepFool. Our approaches are fast and work without requiring knowledge about the model.
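
A minimal sketch of the randomized compression step (each tile gets its own JPEG quality); the tile size and quality levels are illustrative choices, not the SHIELD configuration:

import io
import random
import numpy as np
from PIL import Image

def jpeg_round_trip(block, quality):
    """Compress and decompress one uint8 RGB tile at the given JPEG quality."""
    buf = io.BytesIO()
    Image.fromarray(np.ascontiguousarray(block)).save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return np.array(Image.open(buf))

def randomized_jpeg_defense(image, tile=64, qualities=(20, 40, 60, 80)):
    """image: HxWx3 uint8 array; compress each tile at a randomly chosen quality."""
    out = image.copy()
    h, w = image.shape[:2]
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            q = random.choice(qualities)
            out[y:y+tile, x:x+tile] = jpeg_round_trip(image[y:y+tile, x:x+tile], q)
    return out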

2018-11-19
Papernot, Nicolas, McDaniel, Patrick, Goodfellow, Ian, Jha, Somesh, Celik, Z. Berkay, Swami, Ananthram.  2017.  Practical Black-Box Attacks Against Machine Learning. Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security. :506–519.

Machine learning (ML) models, e.g., deep neural networks (DNNs), are vulnerable to adversarial examples: malicious inputs modified to yield erroneous model outputs, while appearing unmodified to human observers. Potential attacks include having malicious content like malware identified as legitimate or controlling vehicle behavior. Yet, all existing adversarial example attacks require knowledge of either the model internals or its training data. We introduce the first practical demonstration of an attacker controlling a remotely hosted DNN with no such knowledge. Indeed, the only capability of our black-box adversary is to observe labels given by the DNN to chosen inputs. Our attack strategy consists in training a local model to substitute for the target DNN, using inputs synthetically generated by an adversary and labeled by the target DNN. We use the local substitute to craft adversarial examples, and find that they are misclassified by the targeted DNN. To perform a real-world and properly-blinded evaluation, we attack a DNN hosted by MetaMind, an online deep learning API. We find that their DNN misclassifies 84.24% of the adversarial examples crafted with our substitute. We demonstrate the general applicability of our strategy to many ML techniques by conducting the same attack against models hosted by Amazon and Google, using logistic regression substitutes. They yield adversarial examples misclassified by Amazon and Google at rates of 96.19% and 88.94%. We also find that this black-box attack strategy is capable of evading defense strategies previously found to make adversarial example crafting harder.
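
The substitute-model strategy can be pictured with a short PyTorch sketch; query_target stands in for the victim's label-only API, and the substitute architecture and FGSM step are illustrative assumptions, not the paper's exact setup:

import torch
import torch.nn as nn
import torch.nn.functional as F

def train_substitute(query_target, seed_inputs, num_classes, epochs=10):
    """Label local inputs via the target's prediction API and fit a small substitute model."""
    X = torch.stack(list(seed_inputs))
    labels = torch.tensor([query_target(x) for x in X])
    substitute = nn.Sequential(nn.Flatten(), nn.Linear(X[0].numel(), 128),
                               nn.ReLU(), nn.Linear(128, num_classes))
    opt = torch.optim.Adam(substitute.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        F.cross_entropy(substitute(X), labels).backward()
        opt.step()
    return substitute

def fgsm_on_substitute(substitute, x, y, eps=0.1):
    """Craft an adversarial example on the local substitute and rely on transferability."""
    x = x.detach().clone().requires_grad_(True)
    F.cross_entropy(substitute(x.unsqueeze(0)), torch.tensor([y])).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()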

2018-09-05
Chen, Yizheng, Nadji, Yacin, Kountouras, Athanasios, Monrose, Fabian, Perdisci, Roberto, Antonakakis, Manos, Vasiloglou, Nikolaos.  2017.  Practical Attacks Against Graph-based Clustering. Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. :1125–1142.
Graph modeling allows numerous security problems to be tackled in a general way; however, little work has been done to understand the ability of such approaches to withstand adversarial attacks. We design and evaluate two novel graph attacks against a state-of-the-art network-level, graph-based detection system. Our work highlights areas in adversarial machine learning that have not yet been addressed, specifically: graph-based clustering techniques, and a global feature space where realistic attackers without perfect knowledge must be accounted for (by the defenders) in order to be practical. Even though less informed attackers can evade graph clustering with low cost, we show that some practical defenses are possible.
2018-07-06
Baracaldo, Nathalie, Chen, Bryant, Ludwig, Heiko, Safavi, Jaehoon Amir.  2017.  Mitigating Poisoning Attacks on Machine Learning Models: A Data Provenance Based Approach. Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security. :103–110.
The use of machine learning models has become ubiquitous. Their predictions are used to make decisions about healthcare, security, investments and many other critical applications. Given this pervasiveness, it is not surprising that adversaries have an incentive to manipulate machine learning models to their advantage. One way of manipulating a model is through a poisoning or causative attack in which the adversary feeds carefully crafted poisonous data points into the training set. Taking advantage of recently developed tamper-free provenance frameworks, we present a methodology that uses contextual information about the origin and transformation of data points in the training set to identify poisonous data, thereby enabling online and regularly re-trained machine learning applications to consume data sources in potentially adversarial environments. To the best of our knowledge, this is the first approach to incorporate provenance information as part of a filtering algorithm to detect causative attacks. We present two variations of the methodology - one tailored to partially trusted data sets and the other to fully untrusted data sets. Finally, we evaluate our methodology against existing methods to detect poison data and show an improvement in the detection rate.
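
A toy sketch of the filtering intuition (drop a provenance group when excluding it improves held-out accuracy); the classifier and the leave-group-out criterion are illustrative assumptions, not the paper's exact algorithm:

import numpy as np
from sklearn.linear_model import LogisticRegression

def filter_by_provenance(X, y, provenance, X_val, y_val):
    """X, y: training arrays; provenance: one group id per training point."""
    provenance = np.asarray(provenance)
    keep = np.ones(len(y), dtype=bool)
    base = LogisticRegression(max_iter=1000).fit(X, y).score(X_val, y_val)
    for group in np.unique(provenance):
        mask = provenance != group
        score = LogisticRegression(max_iter=1000).fit(X[mask], y[mask]).score(X_val, y_val)
        if score > base:            # the group looks poisonous: excluding it helps
            keep &= mask
    return X[keep], y[keep]
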
Liu, Chang, Li, Bo, Vorobeychik, Yevgeniy, Oprea, Alina.  2017.  Robust Linear Regression Against Training Data Poisoning. Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security. :91–102.
The effectiveness of supervised learning techniques has made them ubiquitous in research and practice. In high-dimensional settings, supervised learning commonly relies on dimensionality reduction to improve performance and identify the most important factors in predicting outcomes. However, the economic importance of learning has made it a natural target for adversarial manipulation of training data, which we term poisoning attacks. Prior approaches to dealing with robust supervised learning rely on strong assumptions about the nature of the feature matrix, such as feature independence and sub-Gaussian noise with low variance. We propose an integrated method for robust regression that relaxes these assumptions, assuming only that the feature matrix can be well approximated by a low-rank matrix. Our techniques integrate improved robust low-rank matrix approximation and robust principal component regression, and yield strong performance guarantees. Moreover, we experimentally show that our methods significantly outperform the state of the art in both running time and prediction error.
Biggio, Battista, Rieck, Konrad, Ariu, Davide, Wressnegger, Christian, Corona, Igino, Giacinto, Giorgio, Roli, Fabio.  2014.  Poisoning Behavioral Malware Clustering. Proceedings of the 2014 Workshop on Artificial Intelligent and Security Workshop. :27–36.
Clustering algorithms have become a popular tool in computer security to analyze the behavior of malware variants, identify novel malware families, and generate signatures for antivirus systems. However, the suitability of clustering algorithms for security-sensitive settings has been recently questioned by showing that they can be significantly compromised if an attacker can exercise some control over the input data. In this paper, we revisit this problem by focusing on behavioral malware clustering approaches, and investigate whether and to what extent an attacker may be able to subvert these approaches through a careful injection of samples with poisoning behavior. To this end, we present a case study on Malheur, an open-source tool for behavioral malware clustering. Our experiments not only demonstrate that this tool is vulnerable to poisoning attacks, but also that it can be significantly compromised even if the attacker can only inject a very small percentage of attacks into the input data. As a remedy, we discuss possible countermeasures and highlight the need for more secure clustering algorithms.
2018-05-16
Khasawneh, Khaled N., Abu-Ghazaleh, Nael, Ponomarev, Dmitry, Yu, Lei.  2017.  RHMD: Evasion-resilient Hardware Malware Detectors. Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture. :315–327.

Hardware Malware Detectors (HMDs) have recently been proposed as a defense against the proliferation of malware. These detectors use low-level features that can be collected by the hardware performance monitoring units on modern CPUs to detect malware as a computational anomaly. Several aspects of the detector construction have been explored, leading to detectors with high accuracy. In this paper, we explore the question of how well evasive malware can avoid detection by HMDs. We show that existing HMDs can be effectively reverse-engineered and subsequently evaded, allowing malware to hide from detection without substantially slowing it down (which is important for certain types of malware). This result demonstrates that the current generation of HMDs can be easily defeated by evasive malware. Next, we explore how well a detector can evolve if it is exposed to this evasive malware during training. We show that simple detectors, such as logistic regression, cannot detect the evasive malware even with retraining. More sophisticated detectors can be retrained to detect evasive malware, but the retrained detectors can be reverse-engineered and evaded again. To address these limitations, we propose a new type of Resilient HMDs (RHMDs) that stochastically switch between different detectors. These detectors can be shown to be provably more difficult to reverse engineer based on recent results in probably approximately correct (PAC) learnability theory. We show that indeed such detectors are resilient to both reverse engineering and evasion, and that the resilience increases with the number and diversity of the individual detectors. Our results demonstrate that these HMDs offer effective defense against evasive malware at low additional complexity.
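
The stochastic-switching idea reduces to a few lines; the sketch below assumes a pool of independently trained detectors with a common predict interface and is not the RHMD hardware design:

import random

class ResilientDetector:
    """Answer each prediction with a randomly chosen member of a detector pool."""
    def __init__(self, detectors):
        self.detectors = detectors  # e.g., models trained on different low-level feature sets

    def predict(self, features):
        return random.choice(self.detectors).predict(features)

Because successive queries may be answered by different detectors, the approximation an adversary reverse-engineers never matches any single deployed detector, which is the resilience property described above.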

2018-05-01
Wang, Weiyu, Zhu, Quanyan.  2017.  On the Detection of Adversarial Attacks Against Deep Neural Networks. Proceedings of the 2017 Workshop on Automated Decision Making for Active Cyber Defense. :27–30.

Deep learning models have been widely studied and proven to achieve high accuracy in various pattern recognition tasks, especially in image recognition. However, due to its non-linear architecture and high-dimensional inputs, a deep learning model's ill-posedness [1] towards adversarial perturbations (small, deliberately crafted perturbations on the input that lead to completely different outputs) has also attracted researchers' attention. This work takes the traffic sign recognition system on the self-driving car as an example, and aims at designing an additional mechanism to improve the robustness of the recognition system. It uses a machine learning model which learns the results of the deep learning model's predictions, with human feedback as labels, and provides the credibility of the current prediction. The mechanism makes use of both the input image and the recognition result as sample space, querying a human user about the correctness (True/False) of the current classification result as few times as possible, and completing the task of detecting adversarial attacks.

Burkard, Cody, Lagesse, Brent.  2017.  Analysis of Causative Attacks Against SVMs Learning from Data Streams. Proceedings of the 3rd ACM on International Workshop on Security And Privacy Analytics. :31–36.

Machine learning algorithms have been proven to be vulnerable to a special type of attack in which an active adversary manipulates the training data of the algorithm in order to reach some desired goal. Although this type of attack has been proven in previous work, it has not been examined in the context of a data stream, and no work has been done to study a targeted version of the attack. Furthermore, current literature does not provide any metrics that allow a system to detect these attacks while they are happening. In this work, we examine the targeted version of this attack on a Support Vector Machine (SVM) that is learning from a data stream, and examine the impact that this attack has on current metrics that are used to evaluate a model's performance. We then propose a new metric for detecting these attacks, and compare its performance against current metrics.

2018-04-11
Muñoz-González, Luis, Biggio, Battista, Demontis, Ambra, Paudice, Andrea, Wongrassamee, Vasin, Lupu, Emil C., Roli, Fabio.  2017.  Towards Poisoning of Deep Learning Algorithms with Back-Gradient Optimization. Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security. :27–38.

A number of online services nowadays rely upon machine learning to extract valuable information from data collected in the wild. This exposes learning algorithms to the threat of data poisoning, i.e., a coordinated attack in which a fraction of the training data is controlled by the attacker and manipulated to subvert the learning process. To date, these attacks have been devised only against a limited class of binary learning algorithms, due to the inherent complexity of the gradient-based procedure used to optimize the poisoning points (a.k.a. adversarial training examples). In this work, we first extend the definition of poisoning attacks to multiclass problems. We then propose a novel poisoning algorithm based on the idea of back-gradient optimization, i.e., to compute the gradient of interest through automatic differentiation, while also reversing the learning procedure to drastically reduce the attack complexity. Compared to current poisoning strategies, our approach is able to target a wider class of learning algorithms, trained with gradient-based procedures, including neural networks and deep learning architectures. We empirically evaluate its effectiveness on several application examples, including spam filtering, malware detection, and handwritten digit recognition. We finally show that, similarly to adversarial test examples, adversarial training examples can also be transferred across different learning algorithms.