Black-Box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers
Title | Black-Box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers |
Publication Type | Conference Paper |
Year of Publication | 2018 |
Authors | Gao, J., Lanchantin, J., Soffa, M. L., Qi, Y. |
Conference Name | 2018 IEEE Security and Privacy Workshops (SPW) |
Keywords | adversarial samples, adversarial text sequences, black box attack, black-box attack, black-box generation, character-level transformations, composability, Deep Learning, deep learning classifiers, DeepWordBug, Enron spam emails, IMDB movie reviews, learning (artificial intelligence), machine learning, Metrics, misclassification, pattern classification, Perturbation methods, Prediction algorithms, program debugging, pubcrawl, real-world text datasets, Recurrent neural networks, resilience, scoring strategies, sentiment analysis, Task Analysis, text analysis, text classification, text input, text perturbations, White Box Security, word embedding |
Abstract | Although various techniques have been proposed to generate adversarial samples for white-box attacks on text, little attention has been paid to a black-box attack, which is a more realistic scenario. In this paper, we present a novel algorithm, DeepWordBug, to effectively generate small text perturbations in a black-box setting that forces a deep-learning classifier to misclassify a text input. We develop novel scoring strategies to find the most important words to modify such that the deep classifier makes a wrong prediction. Simple character-level transformations are applied to the highest-ranked words in order to minimize the edit distance of the perturbation. We evaluated DeepWordBug on two real-world text datasets: Enron spam emails and IMDB movie reviews. Our experimental results indicate that DeepWordBug can reduce the classification accuracy from 99% to 40% on Enron and from 87% to 26% on IMDB. Our results strongly demonstrate that the generated adversarial sequences from a deep-learning model can similarly evade other deep models. |
URL | https://ieeexplore.ieee.org/document/8424632 |
DOI | 10.1109/SPW.2018.00016 |
Citation Key | gao_black-box_2018 |
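
The abstract describes the core of DeepWordBug: use black-box queries to score how much each word influences the classifier's prediction, then apply small character-level edits to the highest-ranked words. The sketch below is only an illustration of that general idea, not the paper's implementation. The leave-one-out scoring, the `target_prob` query function, and the specific edit operations are assumptions made for this example; the paper develops its own scoring strategies and transformations.

```python
import random

def word_scores(words, target_prob):
    """Score each word by how much the target-class probability drops when
    the word is removed (a simple leave-one-out proxy; illustrative only)."""
    base = target_prob(" ".join(words))
    scores = []
    for i in range(len(words)):
        reduced = words[:i] + words[i + 1:]
        scores.append(base - target_prob(" ".join(reduced)))
    return scores

def perturb_word(word, rng):
    """Apply one character-level edit (swap, substitute, delete, or insert)
    so the edit distance of the change stays small."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    op = rng.choice(["swap", "sub", "del", "ins"])
    if op == "swap":
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if op == "sub":
        return word[:i] + rng.choice("abcdefghijklmnopqrstuvwxyz") + word[i + 1:]
    if op == "del":
        return word[:i] + word[i + 1:]
    return word[:i] + rng.choice("abcdefghijklmnopqrstuvwxyz") + word[i:]

def deepwordbug_style_attack(text, target_prob, budget=5, seed=0):
    """Greedy black-box attack sketch: rank words by score, then perturb
    the top-`budget` words with character-level edits."""
    rng = random.Random(seed)
    words = text.split()
    scores = word_scores(words, target_prob)
    ranked = sorted(range(len(words)), key=lambda i: scores[i], reverse=True)
    for i in ranked[:budget]:
        words[i] = perturb_word(words[i], rng)
    return " ".join(words)

# Toy usage: a stand-in "classifier" whose positive-class probability grows
# with the count of the literal word "good".
if __name__ == "__main__":
    toy = lambda s: min(1.0, s.split().count("good") / 3)
    print(deepwordbug_style_attack("this movie is good good good", toy, budget=2))
```

Because the scoring step queries only output probabilities, the sketch needs no gradients or model internals, which is what makes the setting black-box; the character-level edits keep each perturbation's edit distance small, in line with the abstract's description.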