Biblio

Found 104 results

Filters: Keyword is Perturbation methods
2020-09-11
Azakami, Tomoka, Shibata, Chihiro, Uda, Ryuya, Kinoshita, Toshiyuki.  2019.  Creation of Adversarial Examples with Keeping High Visual Performance. 2019 IEEE 2nd International Conference on Information and Computer Technologies (ICICT). :52—56.
The accuracy of image classification by convolutional neural networks now exceeds human ability and contributes to various fields. However, this improvement in image recognition technology deals a great blow to image-based security systems such as CAPTCHA. In particular, since character-string CAPTCHAs already add distortion and noise so that they cannot be read by computers, lowered human readability becomes a problem. Adversarial examples are a technique for producing images that intentionally cause a machine-learning image classifier to err. The best feature of this technique is that when human beings compare the original image with the adversarial example, they cannot perceive any difference in appearance. However, adversarial examples created with the conventional FGSM cannot completely mislead strongly nonlinear networks such as CNNs. Osadchy et al. researched applying adversarial examples to CAPTCHAs and attempted to make a CNN misclassify them, but they could not make the CNN misclassify character images. In this research, we propose a method that applies FGSM to character-string CAPTCHAs and makes CNNs misclassify them.
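For reference, the FGSM technique mentioned above follows the standard one-step formulation; a minimal illustrative sketch in PyTorch is given below (the model, loss function, and epsilon value are placeholders, not the authors' implementation):

```python
import torch

def fgsm(model, loss_fn, x, y, epsilon=0.03):
    """Fast Gradient Sign Method: one-step perturbation along the sign of the input gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    loss.backward()
    # Move each pixel by epsilon in the direction that increases the loss, then clip to valid range.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```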
2020-09-04
Bartan, Burak, Pilanci, Mert.  2019.  Distributed Black-Box optimization via Error Correcting Codes. 2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton). :246—252.
We introduce a novel distributed derivative-free optimization framework that is resilient to stragglers. The proposed method employs coded search directions at which the objective function is evaluated, and a decoding step to find the next iterate. Our framework can be seen as an extension of evolution strategies and structured exploration methods where structured search directions were utilized. As an application, we consider black-box adversarial attacks on deep convolutional neural networks. Our numerical experiments demonstrate a significant improvement in the computation times.
Wu, Yi, Liu, Jian, Chen, Yingying, Cheng, Jerry.  2019.  Semi-black-box Attacks Against Speech Recognition Systems Using Adversarial Samples. 2019 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN). :1—5.
As automatic speech recognition (ASR) systems have been integrated into a diverse set of devices around us in recent years, their security vulnerabilities have become an increasing concern for the public. Existing studies have demonstrated that deep neural networks (DNNs), acting as the computation core of ASR systems, are vulnerable to deliberately designed adversarial attacks. Based on the gradient descent algorithm, existing studies have successfully generated adversarial samples that can disturb ASR systems and produce adversary-expected transcript texts. Most of this research simulated white-box attacks, which require knowledge of all the components in the targeted ASR system. In this work, we propose the first semi-black-box attack against the ASR system Kaldi. Requiring only partial information from Kaldi and none from the DNN, we can embed malicious commands into a single audio clip based on a gradient-independent genetic algorithm. The crafted audio clip is recognized by Kaldi as the embedded malicious commands while remaining unnoticeable to humans. Experiments show that our attack achieves a high success rate with unnoticeable perturbations on three types of audio clips (pop music, pure music, and human commands) without needing the underlying DNN model's parameters or architecture.
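The gradient-independent genetic search described in this abstract can be illustrated with a generic sketch; the fitness oracle, population size, and mutation scale below are illustrative assumptions rather than Kaldi-specific details:

```python
import numpy as np

def genetic_attack(fitness, audio, pop_size=50, generations=200, mutation_std=0.002):
    """Gradient-free search for an additive perturbation that maximizes a black-box fitness score
    (e.g., similarity of the target system's transcript to the attacker's command)."""
    pop = np.random.normal(0.0, mutation_std, size=(pop_size, audio.shape[0]))
    for _ in range(generations):
        scores = np.array([fitness(audio + p) for p in pop])          # query the target system
        elite = pop[np.argsort(scores)[-pop_size // 2:]]              # keep the better half
        parents = elite[np.random.randint(len(elite), size=(pop_size - len(elite), 2))]
        pop = np.vstack([elite, parents.mean(axis=1)])                # crossover by averaging parents
        pop += np.random.normal(0.0, mutation_std, size=pop.shape)    # mutate
    best = pop[int(np.argmax([fitness(audio + p) for p in pop]))]
    return audio + best
```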
Zhao, Pu, Liu, Sijia, Chen, Pin-Yu, Hoang, Nghia, Xu, Kaidi, Kailkhura, Bhavya, Lin, Xue.  2019.  On the Design of Black-Box Adversarial Examples by Leveraging Gradient-Free Optimization and Operator Splitting Method. 2019 IEEE/CVF International Conference on Computer Vision (ICCV). :121—130.
Robust machine learning is currently one of the most prominent topics, with the potential to help shape a future of advanced AI platforms that perform well not only in average cases but also in worst cases or adverse situations. Despite this long-term vision, however, existing studies on black-box adversarial attacks are still restricted to very specific threat-model settings (e.g., a single distortion metric and restrictive assumptions on the target model's feedback to queries) and/or suffer from prohibitively high query complexity. To push for further advances in this field, we introduce a general framework based on an operator splitting method, the alternating direction method of multipliers (ADMM), to devise efficient, robust black-box attacks that work with various distortion metrics and feedback settings without incurring high query complexity. Due to the black-box nature of the threat model, the proposed ADMM solution framework is integrated with zeroth-order (ZO) optimization and Bayesian optimization (BO), and is thus applicable to the gradient-free regime. This results in two new black-box adversarial attack generation methods, ZO-ADMM and BO-ADMM. Our empirical evaluations on image classification datasets show that our proposed approaches have much lower function query complexities than state-of-the-art attack methods while achieving very competitive attack success rates.
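A common building block of gradient-free attacks such as ZO-ADMM is a zeroth-order gradient estimate obtained from random finite differences of the loss; a minimal sketch is shown below (the loss oracle, query budget, and smoothing parameter are placeholders):

```python
import numpy as np

def zo_gradient(loss, x, num_queries=20, mu=0.01):
    """Estimate the gradient of a black-box loss at x from random-direction finite differences."""
    grad = np.zeros_like(x)
    for _ in range(num_queries):
        u = np.random.normal(size=x.shape)
        u /= np.linalg.norm(u)                          # random unit direction
        grad += (loss(x + mu * u) - loss(x)) / mu * u   # directional-derivative estimate
    return grad * (x.size / num_queries)                # average and rescale
```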
Usama, Muhammad, Qayyum, Adnan, Qadir, Junaid, Al-Fuqaha, Ala.  2019.  Black-box Adversarial Machine Learning Attack on Network Traffic Classification. 2019 15th International Wireless Communications Mobile Computing Conference (IWCMC). :84—89.

Deep machine learning techniques have shown promising results in network traffic classification; however, the robustness of these techniques under adversarial threats is still in question. Deep machine learning models have been found vulnerable to small, carefully crafted adversarial perturbations, which raises a major question about the performance of deep machine learning techniques. In this paper, we propose a black-box adversarial attack on network traffic classification. The proposed attack successfully evades deep machine learning-based classifiers, which highlights the potential security threat of using deep machine learning techniques to realize autonomous networks.

Jing, Huiyun, Meng, Chengrui, He, Xin, Wei, Wei.  2019.  Black Box Explanation Guided Decision-Based Adversarial Attacks. 2019 IEEE 5th International Conference on Computer and Communications (ICCC). :1592—1596.
Adversarial attacks have become a hot research field in artificial intelligence security. Decision-based black-box adversarial attacks are much more appropriate in real-world scenarios, where only the final decisions of the targeted deep neural networks are accessible. However, since there is no available guidance for searching for an imperceptible adversarial perturbation, the boundary attack, one of the best-performing decision-based black-box attacks, carries out a computationally expensive search. To improve attack efficiency, we propose a novel black-box-explanation-guided decision-based black-box adversarial attack. First, the problem of decision-based adversarial attacks is modeled as a derivative-free, constrained optimization problem. To solve this optimization problem, a black-box-explanation-guided constrained random search method is proposed to find imperceptible adversarial examples more quickly. The insights into the targeted deep neural network explored by the black-box explanation are fully used to accelerate the computationally expensive random search. Experimental results demonstrate that our proposed attack improves attack efficiency by 64% compared with the boundary attack.
2020-08-13
Junjie, Jia, Haitao, Qin, Wanghu, Chen, Huifang, Ma.  2019.  Trajectory Anonymity Based on Quadratic Anonymity. 2019 3rd International Conference on Electronic Information Technology and Computer Engineering (EITCE). :485—492.
To address the leakage of private information in sensitive regions of published anonymized trajectories caused by attacks, this paper proposes a trajectory anonymity algorithm based on region division. According to the start and stop times of the trajectories, the current sensitive region is found using the k-anonymity set on synchronous trajectories. If the distance between a divided sub-region and an adjacent anonymous area is not greater than a threshold d, the areas are merged. Otherwise, guided by location mapping, forged locations are added to the sub-region according to the original locations so that the divided sub-region satisfies the k-anonymity principle. The forged locations retain the relative positions of the points in the sensitive region, so the anonymity of the divided sub-region is consistent with that of the original region. Experiments show that, compared with existing trajectory anonymization algorithms on synchronous trajectory datasets at the same privacy level, the algorithm is highly effective in both privacy protection and data quality.
Sadeghi, Koosha, Banerjee, Ayan, Gupta, Sandeep K. S..  2019.  An Analytical Framework for Security-Tuning of Artificial Intelligence Applications Under Attack. 2019 IEEE International Conference On Artificial Intelligence Testing (AITest). :111—118.
Machine Learning (ML) algorithms, as the core technology in Artificial Intelligence (AI) applications such as self-driving vehicles, make important decisions by performing a variety of data classification or prediction tasks. Attacks on the data or algorithms in AI applications can lead to misclassification or misprediction, which can cause the applications to fail. For each dataset, the parameters of ML algorithms should be tuned separately to reach a desirable classification or prediction accuracy. Typically, ML experts tune the parameters empirically, which can be time consuming and does not guarantee an optimal result. To this end, some research suggests an analytical approach to tune the ML parameters for maximum accuracy. However, none of these works consider the ML performance under attack in the tuning process. This paper proposes an analytical framework for tuning ML parameters to be secure against attacks while keeping accuracy high. The framework finds the optimal set of parameters by defining a novel objective function that takes into account the test results of both ML accuracy and security against attacks. To validate the framework, an AI application is implemented to recognize whether a subject's eyes are open or closed by applying the k-Nearest Neighbors (kNN) algorithm to electroencephalogram (EEG) signals. In this application, the number of neighbors (k) and the distance metric type, the two main parameters of kNN, are chosen for tuning. The input data perturbation attack, one of the most common attacks on ML algorithms, is used to test the security of the application. An exhaustive search approach is used to solve the optimization problem. The experimental results show that k = 43 and the cosine distance metric form the optimal configuration of kNN for the EEG dataset, leading to 83.75% classification accuracy and reducing the attack success rate to 5.21%.
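The exhaustive security-aware tuning described above can be sketched as a grid search over (k, distance metric) scored by a combination of clean accuracy and robustness; the weighted-sum objective below is an illustrative assumption, not the paper's exact objective function:

```python
from itertools import product
from sklearn.neighbors import KNeighborsClassifier

def tune_knn(X_train, y_train, X_test, y_test, attack_success_rate,
             ks=range(1, 101, 2), metrics=("euclidean", "cosine"), alpha=0.5):
    """Exhaustively search (k, metric), scoring each configuration by a weighted sum of
    clean accuracy and robustness (1 - attack success rate under input perturbation)."""
    best, best_score = None, -1.0
    for k, metric in product(ks, metrics):
        clf = KNeighborsClassifier(n_neighbors=k, metric=metric).fit(X_train, y_train)
        acc = clf.score(X_test, y_test)
        robustness = 1.0 - attack_success_rate(clf, X_test, y_test)  # caller-supplied attack test
        score = alpha * acc + (1.0 - alpha) * robustness
        if score > best_score:
            best, best_score = (k, metric), score
    return best, best_score
```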
2020-07-20
Pengcheng, Li, Yi, Jinfeng, Zhang, Lijun.  2018.  Query-Efficient Black-Box Attack by Active Learning. 2018 IEEE International Conference on Data Mining (ICDM). :1200–1205.
Deep neural networks (DNNs), as popular machine learning models, have been found vulnerable to adversarial attacks. Such attacks construct adversarial examples by adding small perturbations to the raw input that appear unmodified to human eyes but are misclassified by a well-trained classifier. In this paper, we focus on the black-box attack setting, where attackers have almost no access to the underlying models. To conduct a black-box attack, a popular approach trains a substitute model based on information queried from the target DNN. The substitute model can then be attacked using existing white-box attack approaches, and the generated adversarial examples are used to attack the target DNN. Despite its encouraging results, this approach suffers from poor query efficiency: attackers usually need to issue a huge number of queries to collect enough information to train an accurate substitute model. To this end, we first utilize state-of-the-art white-box attack methods to generate samples for querying, and then introduce an active learning strategy to significantly reduce the number of queries needed. In addition, we propose a diversity criterion to avoid sampling bias. Our extensive experimental results on MNIST and CIFAR-10 show that the proposed method can reduce queries by more than 90% while preserving attack success rates and obtaining an accurate substitute model that is more than 85% similar to the target oracle.
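The active-learning step can be illustrated with a simple uncertainty-based query selector; the margin heuristic below is an assumption for illustration, and the paper additionally uses a diversity criterion not shown here:

```python
import numpy as np

def select_queries(candidates, substitute_predict_proba, batch_size=100):
    """Pick the candidates whose substitute-model predictions are least confident
    (smallest margin between the top two class probabilities) for labeling by the target oracle."""
    probs = substitute_predict_proba(candidates)       # shape: (n_candidates, n_classes)
    top_two = np.sort(probs, axis=1)[:, -2:]
    margins = top_two[:, 1] - top_two[:, 0]            # small margin = high uncertainty
    return candidates[np.argsort(margins)[:batch_size]]
```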
2020-07-03
Usama, Muhammad, Asim, Muhammad, Qadir, Junaid, Al-Fuqaha, Ala, Imran, Muhammad Ali.  2019.  Adversarial Machine Learning Attack on Modulation Classification. 2019 UK/ China Emerging Technologies (UCET). :1—4.

Modulation classification is an important component of cognitive self-driving networks. Recently, many ML-based modulation classification methods have been proposed. We evaluated the robustness of 9 ML-based modulation classifiers against the powerful Carlini & Wagner (C-W) attack and showed that current ML-based modulation classifiers do not provide any deterrence against adversarial ML examples. To the best of our knowledge, we are the first to report results of applying the C-W attack to create adversarial examples against various ML models for modulation classification.

Adari, Suman Kalyan, Garcia, Washington, Butler, Kevin.  2019.  Adversarial Video Captioning. 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :24—27.
In recent years, developments in the field of computer vision have allowed deep learning-based techniques to surpass human-level performance. However, these advances have also culminated in the advent of adversarial machine learning techniques, capable of launching targeted image captioning attacks that easily fool deep learning models. Although attacks in the image domain are well studied, little work has been done in the video domain. In this paper, we show it is possible to extend prior attacks in the image domain to the video captioning task, without heavily affecting the video's playback quality. We demonstrate our attack against a state-of-the-art video captioning model, by extending a prior image captioning attack known as Show and Fool. To the best of our knowledge, this is the first successful method for targeted attacks against a video captioning model, which is able to inject 'subliminal' perturbations into the video stream, and force the model to output a chosen caption with up to 0.981 cosine similarity, achieving near-perfect similarity to chosen target captions.
2020-06-22
Tong, Dong, Yong, Zeng, Mengli, Liu, Zhihong, Liu, Jianfeng, Ma, Xiaoyan, Zhu.  2019.  A Topology Based Differential Privacy Scheme for Average Path Length Query. 2019 International Conference on Networking and Network Applications (NaNA). :350–355.
Differential privacy is heavily used in privacy protection because it provides strong protection for private data. Existing differential privacy schemes mainly protect the nodes or edges of a network by perturbing data query results, and most of them cannot meet the privacy protection requirements of multiple types of information. To overcome these issues, a differential privacy security mechanism with average path length (APL) query is proposed in this paper, which realizes privacy protection for both network vertices and edge weights. First, by describing APL, the reasons for choosing this attribute as the query function are analyzed. Second, the global sensitivity of the APL query under the requirements of node privacy protection and edge-weight privacy protection is proved. Finally, the relationship between data availability and the privacy control parameters of differential privacy is analyzed through experiments.
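Such a scheme rests on the standard Laplace mechanism, with noise calibrated to the APL query's global sensitivity; a minimal sketch using networkx follows (the sensitivity value itself is the quantity the paper derives and is left as an input here):

```python
import numpy as np
import networkx as nx

def private_average_path_length(graph, epsilon, sensitivity):
    """Release the (weighted) average shortest-path length under epsilon-differential privacy
    by adding Laplace noise scaled to the global sensitivity of the APL query."""
    true_apl = nx.average_shortest_path_length(graph, weight="weight")
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_apl + noise
```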
2020-06-19
Wang, Si, Liu, Wenye, Chang, Chip-Hong.  2019.  Detecting Adversarial Examples for Deep Neural Networks via Layer Directed Discriminative Noise Injection. 2019 Asian Hardware Oriented Security and Trust Symposium (AsianHOST). :1—6.

Deep learning is a popular and powerful machine learning solution for computer vision tasks. Its most criticized vulnerability is its poor tolerance of adversarial images obtained by deliberately adding imperceptibly small perturbations to clean inputs. Such inputs can delude a classifier into making wrong decisions. Previous defensive techniques mostly focused on refining the models or on input transformation. They have either been implemented only on small datasets or been shown to have limited success. Furthermore, they are rarely scrutinized from the hardware perspective, even though AI on a chip is the roadmap for embedded intelligence everywhere. In this paper we propose a new discriminative noise injection strategy that adaptively selects a few dominant layers and progressively discriminates adversarial from benign inputs. This is made possible by evaluating the differences in label change rate between adversarial and natural images when different amounts of noise are injected into the weights of individual layers of the model. The approach is evaluated on the ImageNet dataset with 8-bit truncated models for state-of-the-art DNN architectures. The results show a high detection rate of up to 88.00% with a false positive rate of only approximately 5% for MobileNet. Both detection rate and false positive rate improve well beyond existing advanced defenses against the most practical non-invasive universal perturbation attack on deep learning-based AI chips.
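The label-change-rate test at the core of this detector can be sketched as follows; the layer selection, noise level, and number of trials are placeholders, whereas the paper selects dominant layers adaptively and uses 8-bit truncated models:

```python
import copy
import torch

def label_change_rate(model, x, layer_names, noise_std=0.05, trials=20):
    """Measure how often the predicted label flips when Gaussian noise is injected into the
    weights of selected layers; adversarial inputs tend to flip labels more readily than benign ones."""
    with torch.no_grad():
        base = model(x).argmax(dim=1)
        flips = 0.0
        for _ in range(trials):
            noisy = copy.deepcopy(model)
            for name, param in noisy.named_parameters():
                if any(name.startswith(layer) for layer in layer_names):
                    param.add_(noise_std * torch.randn_like(param))
            flips += (noisy(x).argmax(dim=1) != base).float().mean().item()
    return flips / trials
```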

2020-04-20
Wang, Chong Xiao, Song, Yang, Tay, Wee Peng.  2018.  Preserving Parameter Privacy in Sensor Networks. 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP). :1316–1320.
We consider the problem of preserving the privacy of a set of private parameters while allowing inference of a set of public parameters based on observations from sensors in a network. We assume that the public and private parameters are correlated with the sensor observations via a linear model. We define the utility loss and privacy gain functions based on the Cramér-Rao lower bounds for estimating the public and private parameters, respectively. Our goal is to minimize the utility loss while ensuring that the privacy gain is no less than a predefined privacy gain threshold, by allowing each sensor to perturb its own observation before sending it to the fusion center. We propose methods to determine the amount of noise each sensor needs to add to its observation under the cases where prior information is available or unavailable.
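In notation sketched here for illustration (the symbols are ours, not necessarily the paper's), the design problem is to choose the covariance $\Sigma$ of the noise each sensor adds so as to trade off the Cramér-Rao-based utility loss $L_u$ against the privacy gain $G_p$:

\[
\min_{\Sigma \succeq 0} \; L_u(\Sigma) \quad \text{subject to} \quad G_p(\Sigma) \ge \gamma,
\]

where $\gamma$ is the predefined privacy gain threshold.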
2020-04-17
Jang, Yunseok, Zhao, Tianchen, Hong, Seunghoon, Lee, Honglak.  2019.  Adversarial Defense via Learning to Generate Diverse Attacks. 2019 IEEE/CVF International Conference on Computer Vision (ICCV). :2740—2749.

With the remarkable success of deep learning, Deep Neural Networks (DNNs) have become dominant tools across machine learning domains. Despite this success, however, it has been found that DNNs are surprisingly vulnerable to malicious attacks; adding small, perceptually indistinguishable perturbations to the data can easily degrade classification performance. Adversarial training is an effective defense strategy for training a robust classifier. In this work, we propose to utilize a generator to learn how to create adversarial examples. Unlike existing approaches that create a one-shot perturbation with a deterministic generator, we propose a recursive and stochastic generator that produces much stronger and more diverse perturbations that comprehensively reveal the vulnerability of the target classifier. Our experimental results on the MNIST and CIFAR-10 datasets show that a classifier adversarially trained with our method yields more robust performance against various white-box and black-box attacks.
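One round of the min-max training described above might look like the following sketch; the generator interface (taking an input batch and a noise vector) and optimizer setup are assumptions for illustration, not the authors' exact recursive formulation:

```python
import torch

def adversarial_training_step(classifier, generator, cls_opt, gen_opt, loss_fn, x, y, noise_dim=64):
    """One min-max step: a stochastic generator learns a perturbation that raises the
    classification loss, then the classifier trains on the perturbed batch."""
    z = torch.randn(x.size(0), noise_dim, device=x.device)

    # Generator step: maximize the classifier's loss on perturbed inputs.
    gen_opt.zero_grad()
    x_adv = (x + generator(x, z)).clamp(0.0, 1.0)
    (-loss_fn(classifier(x_adv), y)).backward()
    gen_opt.step()

    # Classifier step: minimize the loss on freshly perturbed (detached) inputs.
    cls_opt.zero_grad()
    x_adv = (x + generator(x, z)).clamp(0.0, 1.0).detach()
    loss = loss_fn(classifier(x_adv), y)
    loss.backward()
    cls_opt.step()
    return loss.item()
```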

2020-04-03
Gerking, Christopher, Schubert, David.  2019.  Component-Based Refinement and Verification of Information-Flow Security Policies for Cyber-Physical Microservice Architectures. 2019 IEEE International Conference on Software Architecture (ICSA). :61—70.

Since cyber-physical systems are inherently vulnerable to information leaks, software architects need to reason about security policies to define desired and undesired information flow through a system. The microservice architectural style requires the architects to refine a macro-level security policy into micro-level policies for individual microservices. However, when policies are refined in an ill-formed way, information leaks can emerge on composition of microservices. Related approaches to prevent such leaks do not take into account characteristics of cyber-physical systems like real-time behavior or message passing communication. In this paper, we enable the refinement and verification of information-flow security policies for cyber-physical microservice architectures. We provide architects with a set of well-formedness rules for refining a macro-level policy in a way that enforces its security restrictions. Based on the resulting micro-level policies, we present a verification technique to check if the real-time message passing of microservices is secure. In combination, our contributions prevent information leaks from emerging on composition. We evaluate the accuracy of our approach using an extension of the CoCoME case study.

2020-02-18
Huang, Yonghong, Verma, Utkarsh, Fralick, Celeste, Infantec-Lopez, Gabriel, Kumar, Brajesh, Woodward, Carl.  2019.  Malware Evasion Attack and Defense. 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :34–38.

Machine learning (ML) classifiers are vulnerable to adversarial examples. An adversarial example is an input sample that is slightly modified to induce misclassification in an ML classifier. In this work, we investigate white-box and grey-box evasion attacks on an ML-based malware detector and conduct performance evaluations in a real-world setting. We compare defense approaches for mitigating these attacks. We propose a framework for deploying grey-box and black-box attacks against malware detection systems.

Chen, Jiefeng, Wu, Xi, Rastogi, Vaibhav, Liang, Yingyu, Jha, Somesh.  2019.  Towards Understanding Limitations of Pixel Discretization Against Adversarial Attacks. 2019 IEEE European Symposium on Security and Privacy (EuroS P). :480–495.

Wide adoption of artificial neural networks in various domains has led to increasing interest in defending them against adversarial attacks. Preprocessing defense methods such as pixel discretization are particularly attractive in practice due to their simplicity, low computational overhead, and applicability to various systems. It has been observed that such methods work well on simple datasets like MNIST but break on more complicated ones like ImageNet under recently proposed strong white-box attacks. To understand the conditions for success and the potential for improvement, we study the pixel discretization defense method, including more sophisticated variants that take into account the properties of the dataset being discretized. Our results again show poor resistance against strong attacks. We analyze our results in a theoretical framework and offer strong evidence that pixel discretization is unlikely to work on all but the simplest of datasets. Furthermore, our arguments offer insights into why some other preprocessing defenses may be insecure.
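Pixel discretization itself is a simple preprocessing step that maps each pixel to the nearest value in a small codebook; a minimal sketch follows (the uniform 3-level codebook is a placeholder; the paper also studies data-dependent variants):

```python
import numpy as np

def discretize_pixels(image, codebook=np.array([0.0, 0.5, 1.0])):
    """Map every pixel to its nearest codebook value, e.g. 3-level discretization of [0, 1] images."""
    flat = image.reshape(-1, 1)                                      # (n_pixels, 1)
    nearest = np.argmin(np.abs(flat - codebook.reshape(1, -1)), axis=1)
    return codebook[nearest].reshape(image.shape)
```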

2020-01-13
Shen, Yitong, Wang, Lingfeng, Lau, Jim Pikkin, Liu, Zhaoxi.  2019.  A Robust Control Architecture for Mitigating Sensor and Actuator Attacks on PV Converter. 2019 IEEE PES GTD Grand International Conference and Exposition Asia (GTD Asia). :970–975.
The cybersecurity of modern control systems is becoming a critical issue for cyber-physical systems (CPS). Mitigating potential cyberattacks on the control system is an important concern in controller design for enhancing the resilience of the overall system. This paper presents a novel robust control architecture for a PV converter system to mitigate sensor and actuator attacks and reduce the influence of system uncertainty. A sensor and actuator attack is a vicious attack scenario in which attack signals are injected into the sensor and actuator of a CPS simultaneously. A μ-synthesis robust control architecture is proposed to mitigate the sensor and actuator attack and limit system uncertainty perturbations in a DC-DC photovoltaic (PV) converter. A new system state matrix and control architecture are presented by integrating the original system state, the injected attack signals, and the system uncertainty perturbations. In the case study, the proposed μ-synthesis robust controller exhibits robust performance in the face of the sensor and actuator attack.
2019-01-31
Nakamura, T., Nishi, H..  2018.  TMk-Anonymity: Perturbation-Based Data Anonymization Method for Improving Effectiveness of Secondary Use. IECON 2018 - 44th Annual Conference of the IEEE Industrial Electronics Society. :3138–3143.

The recent emergence of smartphones, cloud computing, and the Internet of Things has brought about an explosion of data creation. By collating and merging these enormous data with other information, information-based services become more sophisticated and advanced. At the same time, however, consideration of the privacy violations caused by such merging is indispensable. Conventional perturbation-based anonymization methods for location data add comparatively large noise, which makes it difficult to utilize the data effectively for secondary use. In this research, to solve these problems, we first clarify the definition of privacy preservation and then propose TMk-anonymity according to that definition.

2019-01-16
Liao, F., Liang, M., Dong, Y., Pang, T., Hu, X., Zhu, J..  2018.  Defense Against Adversarial Attacks Using High-Level Representation Guided Denoiser. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. :1778–1787.
Neural networks are vulnerable to adversarial examples, which poses a threat to their application in security-sensitive systems. We propose a high-level representation guided denoiser (HGD) as a defense for image classification. A standard denoiser suffers from the error amplification effect, in which small residual adversarial noise is progressively amplified and leads to wrong classifications. HGD overcomes this problem by using a loss function defined as the difference between the target model's outputs activated by the clean image and by the denoised image. Compared with ensemble adversarial training, which is the state-of-the-art defense method on large images, HGD has three advantages. First, with HGD as a defense, the target model is more robust to both white-box and black-box adversarial attacks. Second, HGD can be trained on a small subset of the images and generalizes well to other images and unseen classes. Third, HGD can be transferred to defend models other than the one guiding it. In the NIPS competition on defense against adversarial attacks, our HGD solution won first place and outperformed other models by a large margin.
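The guiding loss can be sketched as matching the target model's outputs on the denoised adversarial image to those on the clean image; the choice of output layer and norm below is an illustrative assumption, not the paper's exact configuration:

```python
import torch

def hgd_loss(target_model, denoiser, x_adv, x_clean):
    """Representation-guided denoising loss: train the denoiser so the target model's
    outputs on the denoised adversarial image match its outputs on the clean image."""
    denoised = denoiser(x_adv)
    return torch.norm(target_model(denoised) - target_model(x_clean), p=1)
```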
Kreuk, F., Adi, Y., Cisse, M., Keshet, J..  2018.  Fooling End-To-End Speaker Verification With Adversarial Examples. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :1962–1966.
Automatic speaker verification systems are increasingly used as the primary means to authenticate customers. Recently, it has been proposed to train speaker verification systems using end-to-end deep neural models. In this paper, we show that such systems are vulnerable to adversarial example attacks. Adversarial examples are generated by adding peculiar noise to original speaker examples in such a way that the two are almost indistinguishable to a human listener. Yet the generated waveforms, which sound like speaker A, can be used to fool such a system into accepting them as if they had been uttered by speaker B. We present white-box attacks on a deep end-to-end network trained on either YOHO or NTIMIT. We also present two black-box attacks. In the first, we generate adversarial examples with a system trained on NTIMIT and perform the attack on a system trained on YOHO. In the second, we generate the adversarial examples with a system trained using Mel-spectrum features and perform the attack on a system trained using MFCCs. Our results show that one can significantly decrease the accuracy of a target system even when the adversarial examples are generated with a different system, potentially using different features.
Carlini, N., Wagner, D..  2018.  Audio Adversarial Examples: Targeted Attacks on Speech-to-Text. 2018 IEEE Security and Privacy Workshops (SPW). :1–7.
We construct targeted audio adversarial examples on automatic speech recognition. Given any audio waveform, we can produce another that is over 99.9% similar but transcribes as any phrase we choose (recognizing up to 50 characters per second of audio). We apply our white-box iterative optimization-based attack to Mozilla's DeepSpeech implementation end-to-end and show that it has a 100% success rate. The feasibility of this attack introduces a new domain for studying adversarial examples.
Gao, J., Lanchantin, J., Soffa, M. L., Qi, Y..  2018.  Black-Box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers. 2018 IEEE Security and Privacy Workshops (SPW). :50–56.

Although various techniques have been proposed to generate adversarial samples for white-box attacks on text, little attention has been paid to black-box attacks, which are a more realistic scenario. In this paper, we present a novel algorithm, DeepWordBug, to effectively generate small text perturbations in a black-box setting that force a deep-learning classifier to misclassify a text input. We develop novel scoring strategies to find the most important words to modify such that the deep classifier makes a wrong prediction. Simple character-level transformations are applied to the highest-ranked words in order to minimize the edit distance of the perturbation. We evaluated DeepWordBug on two real-world text datasets: Enron spam emails and IMDB movie reviews. Our experimental results indicate that DeepWordBug can reduce classification accuracy from 99% to 40% on Enron and from 87% to 26% on IMDB. Our results strongly demonstrate that adversarial sequences generated against one deep-learning model can similarly evade other deep models.
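A greatly simplified version of the word-scoring and character-transformation idea is sketched below; the leave-one-out scoring and single adjacent-character swap are illustrative stand-ins for DeepWordBug's actual scoring strategies and transformations:

```python
import random

def char_perturb(text, classifier_score, top_k=3):
    """Greedy black-box text perturbation: rank words by how much removing them changes the
    classifier's score, then apply an adjacent-character swap to the top-k words."""
    words = text.split()
    base = classifier_score(text)
    importance = []
    for i in range(len(words)):
        reduced = " ".join(words[:i] + words[i + 1:])
        importance.append(base - classifier_score(reduced))
    for i in sorted(range(len(words)), key=lambda i: importance[i], reverse=True)[:top_k]:
        w = words[i]
        if len(w) > 2:
            j = random.randrange(len(w) - 1)
            w = w[:j] + w[j + 1] + w[j] + w[j + 2:]   # swap characters j and j+1
        words[i] = w
    return " ".join(words)
```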