Biblio

Filters: Keyword is black-box attack
2023-03-31
Zhang, Junjian, Tan, Hao, Deng, Binyue, Hu, Jiacen, Zhu, Dong, Huang, Linyi, Gu, Zhaoquan.  2022.  NMI-FGSM-Tri: An Efficient and Targeted Method for Generating Adversarial Examples for Speaker Recognition. 2022 7th IEEE International Conference on Data Science in Cyberspace (DSC). :167–174.
Most existing deep neural networks (DNNs) are inexplicable and fragile, and can be easily deceived by carefully designed adversarial examples with tiny, undetectable noise. This allows attackers to cause serious consequences in many DNN-assisted scenarios without human perception. In the field of speaker recognition, attacks on speaker recognition systems are relatively mature. Most works focus on white-box attacks, which assume the information of the DNN is obtainable, and only a few works study gray-box attacks. In this paper, we study black-box attacks on speaker recognition systems, which can be applied in the real world since we do not need to know the system information. By combining the ideas of transfer attacks and query attacks, our proposed method NMI-FGSM-Tri can achieve the targeted goal of misleading the system into recognizing any audio as a registered person. Specifically, our method combines the Nesterov accelerated gradient (NAG), an ensemble attack and a restart trigger to design an attack that generates adversarial audios with good performance against black-box DNNs. The experimental results show that the proposed method is superior to extant methods, and the attack success rate can reach as high as 94.8% even if only one query is allowed.
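
For orientation, the following is a minimal PyTorch sketch of a targeted, Nesterov-accelerated iterative FGSM step of the kind this method builds on; it is not the authors' NMI-FGSM-Tri (the ensemble attack and restart trigger are omitted), and the model, loss, and audio range are illustrative assumptions.

import torch

def ni_fgsm_targeted(model, x, target, eps=0.05, steps=10, mu=1.0):
    """Targeted Nesterov/momentum iterative FGSM sketch against a
    differentiable substitute (e.g., a speaker classifier)."""
    alpha = eps / steps                     # per-step perturbation budget
    g = torch.zeros_like(x)                 # accumulated momentum
    x_adv = x.clone().detach()
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(steps):
        # Nesterov look-ahead point before computing the gradient
        x_nes = (x_adv + alpha * mu * g).detach().requires_grad_(True)
        loss = loss_fn(model(x_nes), target)
        grad, = torch.autograd.grad(loss, x_nes)
        # normalize the gradient and accumulate momentum
        g = mu * g + grad / grad.abs().mean(dim=tuple(range(1, grad.dim())), keepdim=True)
        # targeted attack: step toward lower loss on the impersonated label
        x_adv = x_adv - alpha * g.sign()
        # project back into the eps-ball and an assumed [-1, 1] audio range
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(-1.0, 1.0).detach()
    return x_adv
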
Moraffah, Raha, Liu, Huan.  2022.  Query-Efficient Target-Agnostic Black-Box Attack. 2022 IEEE International Conference on Data Mining (ICDM). :368–377.
Adversarial attacks have recently been proposed to scrutinize the security of deep neural networks. Most black-box adversarial attacks, which have partial access to the target through queries, are target-specific; e.g., they require a well-trained surrogate that accurately mimics a given target. In contrast, target-agnostic black-box attacks are developed to attack any target; e.g., they learn a generalized surrogate that can adapt to any target via fine-tuning on samples queried from the target. Despite their success, current state-of-the-art target-agnostic attacks require tremendous fine-tuning steps and consequently an immense number of queries to the target to generate successful attacks. The high query complexity of these attacks makes them easily detectable and thus defendable. We propose a novel query-efficient target-agnostic attack that trains a generalized surrogate network to output the adversarial directions w.r.t. the inputs and equips it with an effective fine-tuning strategy that only fine-tunes the surrogate when it fails to provide useful directions to generate the attacks. In particular, we show that to effectively adapt to any target and generate successful attacks, it is sufficient to fine-tune the surrogate with informative samples that help the surrogate get out of the failure mode with additional information on the target's local behavior. Extensive experiments on the CIFAR-10 and CIFAR-100 datasets demonstrate that the proposed target-agnostic approach can generate highly successful attacks for any target network with very few fine-tuning steps and thus a significantly smaller number of queries (reduced by several orders of magnitude) compared to the state-of-the-art baselines.
2023-01-06
Jagadeesha, Nishchal.  2022.  Facial Privacy Preservation using FGSM and Universal Perturbation attacks. 2022 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COM-IT-CON). 1:46–52.
Research done in facial privacy so far has established that race, age, and gender, which are classifiable and compliant biometric attributes, can be gleaned from a human's facial image. Noticeable distortions, morphing, and face-swapping are some of the techniques that have been researched to restore consumers' privacy. By fooling face recognition models, these techniques cater superficially to the needs of user privacy; however, the presence of visible manipulations negatively affects the aesthetic of the image. The objective of this work is to highlight common adversarial techniques that can be used to introduce granular pixel distortions using white-box and black-box perturbation algorithms that ensure the privacy of users' sensitive or personal data in face images, fooling AI facial recognition models while maintaining the aesthetics and visual integrity of the image.
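
As a point of reference for the white-box case mentioned above, a minimal single-step FGSM sketch in PyTorch is shown below; the face-recognition model, label encoding, and epsilon are placeholders rather than the paper's pipeline.

import torch

def fgsm_perturb(model, image, label, eps=4/255):
    """One-step FGSM: add eps * sign(gradient of the loss w.r.t. the input)."""
    image = image.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(image), label)
    loss.backward()
    # move each pixel slightly in the direction that increases the recognition loss
    perturbed = image + eps * image.grad.sign()
    return perturbed.clamp(0.0, 1.0).detach()
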
2022-12-20
Liu, Xiaolei, Li, Xiaoyu, Zheng, Desheng, Bai, Jiayu, Peng, Yu, Zhang, Shibin.  2022.  Automatic Selection Attacks Framework for Hard Label Black-Box Models. IEEE INFOCOM 2022 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). :1–7.

Current adversarial attacks against machine learning models can be divided into white-box attacks and black-box attacks. Black-box attacks can further be subdivided into soft-label and hard-label settings; the latter returns only the class with the highest prediction probability, which makes gradient estimation difficult. However, due to its wide applicability, exploring hard-label black-box attacks is of great research significance and application value. This paper proposes an Automatic Selection Attacks Framework (ASAF) for hard-label black-box models, which can be explained in two aspects based on existing attack methods. First, ASAF applies model equivalence to select substitute models automatically so as to generate adversarial examples and then completes black-box attacks based on their transferability. Second, specified feature selection and a parallel attack method are proposed to shorten the attack time and improve the attack success rate. The experimental results show that ASAF can achieve a success rate of more than 90% for non-targeted attacks on the common models ResNet-101 (on CIFAR-10) and InceptionV4 (on ImageNet). Meanwhile, compared with FGSM and other attack algorithms, the attack time is reduced by at least 89.7% and 87.8% on the two datasets, respectively. Besides, it can achieve a 90% attack success rate on the online model BaiduAI digital recognition. In conclusion, ASAF is the first automatic selection attacks framework for hard-label black-box models, in which specified feature selection and parallel attack methods speed up automatic attacks.
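The abstract does not detail ASAF's selection mechanism, so the sketch below only illustrates the transfer step it relies on: craft examples on a locally available substitute and verify them against a hard-label oracle that returns a single class index per query; all names are illustrative.

import torch

def transfer_attack(substitute, hard_label_oracle, x, y, eps=8/255, steps=20):
    """Craft adversarial examples on a white-box substitute, then check
    whether they transfer to a black-box, hard-label target."""
    alpha = eps / steps
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = torch.nn.functional.cross_entropy(substitute(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv + alpha * grad.sign()
        # stay inside the eps-ball and the valid image range
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0.0, 1.0).detach()
    transferred = hard_label_oracle(x_adv) != y      # one hard-label query per example
    return x_adv, transferred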

2022-11-18
Khoshavi, Navid, Sargolzaei, Saman, Bi, Yu, Roohi, Arman.  2021.  Entropy-Based Modeling for Estimating Adversarial Bit-flip Attack Impact on Binarized Neural Network. 2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC). :493–498.
Over the past years, the high demand to efficiently process deep learning (DL) models has driven the market of chip design companies. However, the new Deep Chip architectures, a common term for DL hardware accelerators, have paid little attention to the security requirements of quantized neural networks (QNNs), while black-box and white-box adversarial attacks can jeopardize the integrity of the inference accelerator. Therefore, this paper presents a comprehensive study of the resiliency of QNN topologies to black-box attacks. Herein, different attack scenarios are performed on an FPGA-processor co-design, and the collected results are extensively analyzed to estimate the degree of impact of different types of attacks on the QNN topology. Specifically, we evaluate the sensitivity of the QNN accelerator to a range of bit-flip attacks (BFAs) that might occur in the operational lifetime of the device. The BFAs are injected at uniformly distributed times, either across the entire QNN or per individual layer, during image classification. The acquired results are used to build an entropy-based model that can be leveraged to construct QNN architectures that are resilient to bit-flip attacks.
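
A hedged NumPy sketch of the fault model the abstract describes is given below: for a binarized layer, a stored weight is a single bit, so a bit-flip is a sign flip; the layer shape and flip count are arbitrary, and the FPGA co-design and entropy model are not reproduced.

import numpy as np

def flip_binarized_weights(weights, n_flips, rng=None):
    """Flip the sign of n_flips randomly chosen entries of a {-1, +1} weight
    tensor, emulating uniformly distributed bit-flip faults."""
    rng = rng or np.random.default_rng()
    flat = weights.reshape(-1).copy()
    idx = rng.choice(flat.size, size=n_flips, replace=False)
    flat[idx] *= -1
    return flat.reshape(weights.shape)

# example: inject 10 faults into a toy 128x64 binarized weight matrix
w = np.where(np.random.randn(128, 64) >= 0, 1, -1).astype(np.int8)
w_faulty = flip_binarized_weights(w, n_flips=10)
print((w != w_faulty).sum())    # -> 10
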
2022-01-31
Zhang, Yun, Li, Hongwei, Xu, Guowen, Luo, Xizhao, Dong, Guishan.  2021.  Generating Audio Adversarial Examples with Ensemble Substituted Models. ICC 2021 - IEEE International Conference on Communications. :1–6.
The rapid development of machine learning technology has prompted applications of Automatic Speech Recognition (ASR). However, studies have shown that state-of-the-art ASR technologies are still vulnerable to various attacks, which severely undermines the stability of ASR. In general, most existing attack techniques for ASR models are based on white-box scenarios, where the adversary uses adversarial samples to generate a substituted model corresponding to the target model. In contrast, there are fewer attack schemes in the black-box scenario. Moreover, no scheme considers the problem of how to construct the architecture of the substituted models. In this paper, we point out that constructing a good substituted-model architecture is crucial to the effectiveness of the attack, as it helps to generate a more sophisticated set of adversarial examples. We evaluate the performance of different substituted models through comprehensive experiments and find that ensembles of substituted models can achieve the optimal attack effect. The experiments show that our approach achieves an attack success rate of over 80% (a 2% improvement over the latest work) while maintaining the authenticity of the original sample well.
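
The sketch below illustrates one hedged way to attack an ensemble of substituted models: average their logits before taking the gradient, so the perturbation is not over-fitted to a single architecture. The substitute models and step size are assumptions, not the paper's configuration.

import torch

def ensemble_attack_step(substitutes, audio, label, alpha=0.002):
    """One FGSM-style step against the mean logits of several substitute models."""
    audio = audio.clone().detach().requires_grad_(True)
    logits = torch.stack([m(audio) for m in substitutes]).mean(dim=0)
    loss = torch.nn.functional.cross_entropy(logits, label)
    loss.backward()
    return (audio + alpha * audio.grad.sign()).detach()
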
2021-07-27
Fan, Wenshu, Li, Hongwei, Jiang, Wenbo, Xu, Guowen, Lu, Rongxing.  2020.  A Practical Black-Box Attack Against Autonomous Speech Recognition Model. GLOBECOM 2020 - 2020 IEEE Global Communications Conference. :1–6.
With the wide application of machine learning (ML) technology, automatic speech recognition (ASR) has made great progress in recent years. Despite its great potential, there are various evasion attacks on ML-based ASR, which could affect the security of applications built upon ASR. Up to now, most studies focus on white-box attacks on ASR, and almost no attention has been paid to black-box attacks in the audio domain, where attackers can only query the target model to get output labels rather than probability vectors. In this paper, we propose an evasion attack against ASR in the above-mentioned situation, which is more feasible in realistic scenarios. Specifically, we first train a substitute model using data augmentation, which ensures that we have enough training samples while querying the target model only a small number of times. Then, based on the substitute model, we apply the Differential Evolution (DE) algorithm to craft adversarial examples and implement a black-box attack against ASR models on the Speech Commands dataset. Extensive experiments are conducted, and the results illustrate that our approach achieves untargeted attacks with an over 70% success rate while still maintaining the authenticity of the original data well.
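
Below is a hedged sketch of crafting the perturbation with SciPy's differential evolution against a locally trained substitute; the substitute's probability interface, the tiled low-dimensional search space, and the bounds are illustrative assumptions rather than the paper's exact setup.

import numpy as np
from scipy.optimize import differential_evolution

def de_attack(substitute_predict_proba, audio, true_label, eps=0.005, dim=64):
    """Search a low-dimensional perturbation (tiled across the clip) that
    minimizes the substitute's confidence in the true label."""
    reps = int(np.ceil(audio.size / dim))

    def objective(z):
        delta = np.tile(z, reps)[:audio.size]            # expand to full length
        probs = substitute_predict_proba(np.clip(audio + delta, -1.0, 1.0))
        return probs[true_label]                          # lower is better

    result = differential_evolution(objective, bounds=[(-eps, eps)] * dim,
                                    maxiter=30, popsize=15, seed=0)
    delta = np.tile(result.x, reps)[:audio.size]
    return np.clip(audio + delta, -1.0, 1.0)
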
2021-05-20
Yu, Jia ao, Peng, Lei.  2020.  Black-box Attacks on DNN Classifier Based on Fuzzy Adversarial Examples. 2020 IEEE 5th International Conference on Signal and Image Processing (ICSIP). :965–969.
The security of deep learning becomes increasingly important as its applications multiply. The adversarial attack is a known method that makes the performance of a deep neural network (DNN) decline rapidly. However, an adversarial attack needs gradient knowledge of the target network to craft specific adversarial examples; this is the white-box setting and is rarely realistic. In this paper, we implement a black-box attack on a DNN classifier via a functionally equivalent network, without knowing the internal structure and parameters of the target network. Moreover, we increase the entropy of the noise via a deep convolutional generative adversarial network (DCGAN) to make it seem fuzzier, avoiding being probed and easily eliminated by adversarial training. Experiments show that this method can quickly produce a large number of adversarial examples in batches and that the target network cannot improve its accuracy simply via adversarial training.
2021-03-04
Carlini, N., Farid, H.  2020.  Evading Deepfake-Image Detectors with White- and Black-Box Attacks. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). :2804–2813.

It is now possible to synthesize highly realistic images of people who do not exist. Such content has, for example, been implicated in the creation of fraudulent social media profiles responsible for disinformation campaigns. Significant efforts are, therefore, being deployed to detect synthetically-generated content. One popular forensic approach trains a neural network to distinguish real from synthetic content. We show that such forensic classifiers are vulnerable to a range of attacks that reduce the classifier to near-0% accuracy. We develop five attack case studies on a state-of-the-art classifier that achieves an area under the ROC curve (AUC) of 0.95 on almost all existing image generators, when only trained on one generator. With full access to the classifier, we can flip the lowest bit of each pixel in an image to reduce the classifier's AUC to 0.0005; perturb 1% of the image area to reduce the classifier's AUC to 0.08; or add a single noise pattern in the synthesizer's latent space to reduce the classifier's AUC to 0.17. We also develop a black-box attack that, with no access to the target classifier, reduces the AUC to 0.22. These attacks reveal significant vulnerabilities of certain image-forensic classifiers.
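The simplest attack in the abstract, flipping the lowest bit of each pixel, can be illustrated with the short NumPy sketch below; in the paper the flip direction is chosen adversarially using the classifier's gradient, whereas here the bit is toggled unconditionally.

import numpy as np

def flip_lowest_bit(image_u8):
    """Flip the least-significant bit of every 8-bit pixel: a +/-1 change
    that is visually imperceptible."""
    return image_u8 ^ np.uint8(1)

# XOR with 1 toggles the last bit: 200 -> 201, 201 -> 200, etc.
img = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)
assert np.all(np.abs(flip_lowest_bit(img).astype(int) - img.astype(int)) == 1)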

Guo, H., Wang, Z., Wang, B., Li, X., Shila, D. M.  2020.  Fooling A Deep-Learning Based Gait Behavioral Biometric System. 2020 IEEE Security and Privacy Workshops (SPW). :221–227.

We leverage deep learning algorithms on various user behavioral information gathered from end-user devices to classify a subject of interest. In spite of the ability of these techniques to counter spoofing threats, they are vulnerable to adversarial learning attacks, where an attacker adds adversarial noise to the input samples to fool the classifier into false acceptance. Recently, a handful of mature techniques like the Fast Gradient Sign Method (FGSM) have been proposed to aid white-box attacks, where an attacker has complete knowledge of the machine learning model. In contrast, we exploit a black-box attack against a behavioral biometric system based on gait patterns, by using FGSM and training a shadow model that mimics the target system. The attacker has limited knowledge of the target model and no knowledge of the real user being authenticated, but induces a false acceptance in authentication. Our goal is to understand the feasibility of a black-box attack and to what extent FGSM on shadow models would contribute to its success. Our results show that the performance of FGSM highly depends on the quality of the shadow model, which is in turn impacted by key factors including the number of queries allowed by the target system in order to train the shadow model. Our experimentation results have revealed strong relationships between the shadow model and FGSM performance, as well as the effect of the number of FGSM iterations used to create an attack instance. These insights also shed light on deep-learning algorithms' model shareability that can be exploited to launch a successful attack.
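A hedged PyTorch sketch of the shadow-model step is shown below: collect labels from the target system under a fixed query budget and fit a local model on them, which FGSM then attacks in place of the target; the query interface, architecture, and training schedule are placeholders, not the paper's setup.

import torch

def train_shadow_model(shadow, target_query, samples, query_budget, epochs=20, lr=1e-3):
    """Fit a local shadow model on (sample, target-label) pairs gathered
    within a fixed query budget; attack quality tracks this budget."""
    x = samples[:query_budget]
    with torch.no_grad():
        y = target_query(x)                 # class labels returned by the target system
    opt = torch.optim.Adam(shadow.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = torch.nn.functional.cross_entropy(shadow(x), y)
        loss.backward()
        opt.step()
    return shadow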

2020-09-04
Khan, Aasher, Rehman, Suriya, Khan, Muhammad U.S, Ali, Mazhar.  2019.  Synonym-based Attack to Confuse Machine Learning Classifiers Using Black-box Setting. 2019 4th International Conference on Emerging Trends in Engineering, Sciences and Technology (ICEEST). :1–7.
Twitter, being one of the most popular content-sharing platforms, is giving rise to automated accounts called "bots". A majority of the users on Twitter are bots. Various machine learning (ML) algorithms are designed to detect bots, but such ML-based models have vulnerability constraints of their own. This paper exploits vulnerabilities of machine learning (ML) algorithms through a black-box attack. An adversarial text sequence causes deep learning (DL) classifiers for bot detection to misclassify. Literature shows that ML models are vulnerable to attacks. The aim of this paper is to compromise the accuracy of ML-based bot-detection algorithms by replacing original words in tweets with their synonyms. Our results show a 7.2% decrease in accuracy for bot tweets, therefore classifying bot tweets as legitimate tweets.
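
A hedged sketch of the synonym-substitution idea is given below, using NLTK's WordNet to propose replacements; the bot-detection classifier is a placeholder callable returning a label string, and the paper's word-selection strategy is not reproduced.

import nltk
from nltk.corpus import wordnet

nltk.download("wordnet", quiet=True)

def synonym_variants(tweet):
    """Yield copies of the tweet with one word swapped for a WordNet synonym."""
    words = tweet.split()
    for i, word in enumerate(words):
        for syn in wordnet.synsets(word):
            for lemma in syn.lemma_names():
                candidate = lemma.replace("_", " ")
                if candidate.lower() != word.lower():
                    yield " ".join(words[:i] + [candidate] + words[i + 1:])

def evade(classifier, tweet):
    """Return the first synonym-substituted tweet that the (hypothetical)
    classifier no longer flags as a bot, or None if no single swap works."""
    for variant in synonym_variants(tweet):
        if classifier(variant) != "bot":
            return variant
    return None
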
2019-01-16
Gao, J., Lanchantin, J., Soffa, M. L., Qi, Y..  2018.  Black-Box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers. 2018 IEEE Security and Privacy Workshops (SPW). :50–56.

Although various techniques have been proposed to generate adversarial samples for white-box attacks on text, little attention has been paid to a black-box attack, which is a more realistic scenario. In this paper, we present a novel algorithm, DeepWordBug, to effectively generate small text perturbations in a black-box setting that forces a deep-learning classifier to misclassify a text input. We develop novel scoring strategies to find the most important words to modify such that the deep classifier makes a wrong prediction. Simple character-level transformations are applied to the highest-ranked words in order to minimize the edit distance of the perturbation. We evaluated DeepWordBug on two real-world text datasets: Enron spam emails and IMDB movie reviews. Our experimental results indicate that DeepWordBug can reduce the classification accuracy from 99% to 40% on Enron and from 87% to 26% on IMDB. Our results strongly demonstrate that the generated adversarial sequences from a deep-learning model can similarly evade other deep models.
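The two stages described above, scoring words and perturbing the top-ranked ones at character level, can be illustrated with the hedged sketch below; predict_true_prob is a placeholder returning the classifier's confidence in the correct class, and the scoring and edit choices are simplified relative to DeepWordBug.

import random

def word_scores(predict_true_prob, words):
    """Score each word by the confidence drop observed when it is removed."""
    base = predict_true_prob(" ".join(words))
    return [base - predict_true_prob(" ".join(words[:i] + words[i + 1:]))
            for i in range(len(words))]

def swap_adjacent_chars(word, rng):
    """Small character-level edit: swap two neighbouring letters."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def perturb_text(predict_true_prob, text, budget=3, seed=0):
    """Transform the highest-scoring words, leaving the rest of the text intact."""
    rng = random.Random(seed)
    words = text.split()
    scores = word_scores(predict_true_prob, words)
    ranked = sorted(range(len(words)), key=lambda i: scores[i], reverse=True)
    for i in ranked[:budget]:
        words[i] = swap_adjacent_chars(words[i], rng)
    return " ".join(words)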

2018-11-19
Papernot, Nicolas, McDaniel, Patrick, Goodfellow, Ian, Jha, Somesh, Celik, Z. Berkay, Swami, Ananthram.  2017.  Practical Black-Box Attacks Against Machine Learning. Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security. :506–519.

Machine learning (ML) models, e.g., deep neural networks (DNNs), are vulnerable to adversarial examples: malicious inputs modified to yield erroneous model outputs, while appearing unmodified to human observers. Potential attacks include having malicious content like malware identified as legitimate or controlling vehicle behavior. Yet, all existing adversarial example attacks require knowledge of either the model internals or its training data. We introduce the first practical demonstration of an attacker controlling a remotely hosted DNN with no such knowledge. Indeed, the only capability of our black-box adversary is to observe labels given by the DNN to chosen inputs. Our attack strategy consists in training a local model to substitute for the target DNN, using inputs synthetically generated by an adversary and labeled by the target DNN. We use the local substitute to craft adversarial examples, and find that they are misclassified by the targeted DNN. To perform a real-world and properly-blinded evaluation, we attack a DNN hosted by MetaMind, an online deep learning API. We find that their DNN misclassifies 84.24% of the adversarial examples crafted with our substitute. We demonstrate the general applicability of our strategy to many ML techniques by conducting the same attack against models hosted by Amazon and Google, using logistic regression substitutes. They yield adversarial examples misclassified by Amazon and Google at rates of 96.19% and 88.94%. We also find that this black-box attack strategy is capable of evading defense strategies previously found to make adversarial example crafting harder.
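A hedged PyTorch sketch of the substitute-training loop with Jacobian-based dataset augmentation is shown below; the oracle is any label-only query interface, and the substitute architecture, training schedule, and lambda are illustrative, not the configuration used against the hosted APIs.

import torch

def jacobian_augment(substitute, x, labels, lam=0.1):
    """Grow the dataset by pushing each point along the sign of the gradient
    of the substitute's output for its oracle-assigned label."""
    x = x.clone().detach().requires_grad_(True)
    out = substitute(x)
    selected = out.gather(1, labels.unsqueeze(1)).sum()
    grad, = torch.autograd.grad(selected, x)
    return torch.cat([x.detach(), (x + lam * grad.sign()).detach()])

def train_substitute(substitute, oracle, seed_x, rounds=4, epochs=10, lr=1e-3):
    """Iteratively label data with the oracle, fit the substitute, and expand
    the training set with Jacobian-based augmentation."""
    x = seed_x
    opt = torch.optim.Adam(substitute.parameters(), lr=lr)
    for _ in range(rounds):
        with torch.no_grad():
            y = oracle(x)                    # label-only queries to the remote DNN
        for _ in range(epochs):
            opt.zero_grad()
            torch.nn.functional.cross_entropy(substitute(x), y).backward()
            opt.step()
        x = jacobian_augment(substitute, x, y)
    return substitute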

2018-06-07
Chen, Pin-Yu, Zhang, Huan, Sharma, Yash, Yi, Jinfeng, Hsieh, Cho-Jui.  2017.  ZOO: Zeroth Order Optimization Based Black-box Attacks to Deep Neural Networks Without Training Substitute Models. Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security. :15–26.
Deep neural networks (DNNs) are one of the most prominent technologies of our time, as they achieve state-of-the-art performance in many machine learning tasks, including but not limited to image classification, text mining, and speech processing. However, recent research on DNNs has indicated ever-increasing concern about their robustness to adversarial examples, especially for security-critical tasks such as traffic sign identification for autonomous driving. Studies have unveiled the vulnerability of a well-trained DNN by demonstrating the ability of generating barely noticeable (to both humans and machines) adversarial images that lead to misclassification. Furthermore, researchers have shown that these adversarial images are highly transferable by simply training and attacking a substitute model built upon the target model, known as a black-box attack on DNNs. Similar to the setting of training substitute models, in this paper we propose an effective black-box attack that also only has access to the input (images) and the output (confidence scores) of a targeted DNN. However, different from leveraging attack transferability from substitute models, we propose zeroth order optimization (ZOO) based attacks to directly estimate the gradients of the targeted DNN for generating adversarial examples. We use zeroth order stochastic coordinate descent along with dimension reduction, hierarchical attack and importance sampling techniques to efficiently attack black-box models. By exploiting zeroth order optimization, improved attacks to the targeted DNN can be accomplished, sparing the need for training substitute models and avoiding the loss in attack transferability. Experimental results on MNIST, CIFAR-10 and ImageNet show that the proposed ZOO attack is as effective as the state-of-the-art white-box attack (e.g., Carlini and Wagner's attack) and significantly outperforms existing black-box attacks via substitute models.
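
A hedged NumPy sketch of the core zeroth-order estimator is given below: a symmetric finite difference on a random subset of coordinates followed by a plain coordinate-descent update; the paper's attack loss, ADAM/Newton coordinate updates, dimension reduction, hierarchical attack, and importance sampling are omitted, and the loss callable is a placeholder built from the targeted DNN's confidence scores.

import numpy as np

def zoo_step(loss, x, step=0.01, h=1e-4, n_coords=128, rng=None):
    """Estimate the gradient of a black-box loss on randomly chosen coordinates
    via symmetric differences, then take one coordinate-descent step."""
    rng = rng or np.random.default_rng()
    flat = x.reshape(-1).copy()
    coords = rng.choice(flat.size, size=min(n_coords, flat.size), replace=False)
    for i in coords:
        e = np.zeros_like(flat)
        e[i] = h
        # two queries per coordinate: f(x + h*e_i) and f(x - h*e_i)
        g_i = (loss((flat + e).reshape(x.shape)) - loss((flat - e).reshape(x.shape))) / (2 * h)
        flat[i] -= step * g_i                # descend along the estimated gradient
    return flat.reshape(x.shape)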