Biblio
Deep neural networks (DNNs) perform well in image recognition, speech recognition, and pattern recognition. However, poisoning attacks are a serious threat to DNN security. A poisoning attack reduces the accuracy of a DNN by adding malicious training data during the training process. In some settings, such as military applications, it may be necessary to reduce the accuracy of only a chosen class in the model. For example, an attacker who wants to keep only nuclear facilities from being recognized may need to intentionally prevent an unmanned aerial vehicle (UAV) from correctly recognizing nuclear-related facilities. In this paper, we propose a selective poisoning attack that reduces the accuracy of only a chosen class in the model. The proposed method does so by training the model on malicious training data corresponding to the chosen class, while maintaining the accuracy of the remaining classes. For the experiments, we used TensorFlow as the machine learning library and MNIST and CIFAR10 as datasets. Experimental results show that the proposed method can reduce the accuracy of the chosen class to 43.2% and 55.3% on MNIST and CIFAR10, respectively, while maintaining the accuracy of the remaining classes.
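The abstract above does not say how the malicious training data are constructed. As a rough illustration only, the sketch below assumes the simplest possible variant: copies of chosen-class samples are added to the training set with deliberately wrong labels. The function name `make_poison_samples` and all of its parameters are hypothetical, and this is not necessarily the construction used in the paper.

```python
import random

def make_poison_samples(dataset, target_class, num_classes, count, seed=0):
    """Build extra training pairs that mislabel samples of one chosen class.

    Assumption: mislabeled copies stand in for the paper's malicious data,
    whose actual construction the abstract does not describe.
    """
    rng = random.Random(seed)
    # Only samples of the chosen class are used, so other classes are untouched.
    target_samples = [x for x, y in dataset if y == target_class]
    wrong_labels = [c for c in range(num_classes) if c != target_class]
    return [(rng.choice(target_samples), rng.choice(wrong_labels))
            for _ in range(count)]

# Toy dataset: 100 (feature, label) pairs over 10 classes.
clean = [((i, i % 10), i % 10) for i in range(100)]
poison = make_poison_samples(clean, target_class=3, num_classes=10, count=20)
training_set = clean + poison  # the clean data are kept, the poison is added
```

Because the added pairs are drawn only from the chosen class, the clean samples of the remaining classes enter training unchanged, which is the property the selective attack relies on.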
Deep neural networks (DNNs) exhibit excellent performance in machine learning tasks such as image recognition, pattern recognition, speech recognition, and intrusion detection. However, adversarial examples, which are inputs intentionally corrupted by noise, can cause misclassification. As adversarial examples are serious threats to DNNs, both adversarial attacks and methods of defending against them have been continuously studied. Zero-day adversarial examples are created from new test data and are unknown to the classifier; hence, they represent a more significant threat to DNNs. To the best of our knowledge, the literature contains no analytical studies of zero-day adversarial examples that examine attack and defense methods experimentally across several scenarios. Therefore, in this study, zero-day adversarial examples are analyzed in practice, with an emphasis on attack and defense methods, through experiments using various scenarios composed of a fixed target model and an adaptive target model. The Carlini method was used as the state-of-the-art attack, while adversarial training was used as a typical defense method. We used the MNIST dataset and analyzed the success rates of zero-day adversarial examples, their average distortions, and the recognition of original samples across several scenarios of fixed and adaptive target models. Experimental results demonstrate that changing the parameters of the target model in real time leads to resistance to adversarial examples in both the fixed and adaptive target models.
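The two attack metrics named in the abstract, success rate and average distortion, can be sketched as follows. This is a minimal illustration under stated assumptions: distortion is taken to be the L2 distance between an original sample and its adversarial version, and an attack counts as successful when the model's prediction differs from the true label. The abstract does not give the exact definitions, and the function names are hypothetical.

```python
import math

def l2_distortion(original, adversarial):
    """L2 distance between two flat feature vectors (assumed distortion measure)."""
    return math.sqrt(sum((a - o) ** 2 for o, a in zip(original, adversarial)))

def attack_summary(originals, adversarials, true_labels, predicted_labels):
    """Return (success_rate, average_distortion) over a batch of examples.

    Assumption: an untargeted attack succeeds when the prediction on the
    adversarial example differs from the true label.
    """
    n = len(originals)
    successes = sum(1 for t, p in zip(true_labels, predicted_labels) if t != p)
    total = sum(l2_distortion(o, a) for o, a in zip(originals, adversarials))
    return successes / n, total / n

# Two toy samples: the first is perturbed and misclassified, the second is not.
originals = [[0.0, 0.0], [1.0, 1.0]]
adversarials = [[3.0, 4.0], [1.0, 1.0]]
rate, avg = attack_summary(originals, adversarials, [0, 1], [1, 1])
```

In this toy batch the first sample is moved by a distance of 5 and misclassified while the second is left unchanged, so the success rate is 0.5 and the average distortion is 2.5.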