Biblio
Deep neural networks (DNNs) provide good performance for image recognition, speech recognition, and pattern recognition. However, poisoning attacks are a serious threat to DNN security. A poisoning attack reduces the accuracy of a DNN by adding malicious training data during the training process. In some settings, such as military applications, it may be necessary to reduce the accuracy of only a chosen class in the model. For example, an attacker who wants to keep only nuclear facilities from being recognized may need to intentionally prevent a UAV from correctly recognizing nuclear-related facilities. In this paper, we propose a selective poisoning attack that reduces the accuracy of only a chosen class in the model. The proposed method reduces the accuracy of the chosen class by training the model on malicious training data corresponding to that class, while maintaining the accuracy of the remaining classes. For the experiments, we used TensorFlow as the machine learning library and MNIST and CIFAR10 as datasets. Experimental results show that the proposed method can reduce the accuracy of the chosen class to 43.2% and 55.3% on MNIST and CIFAR10, respectively, while maintaining the accuracy of the remaining classes.
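The idea of degrading a single class can be illustrated with a minimal sketch. It uses plain label flipping as a stand-in; this is not the method proposed in the paper, whose construction of malicious training data is not described in the abstract. The helper names `flip_labels_for_class` and `per_class_accuracy`, and the parameters `fraction` and `seed`, are hypothetical and introduced only for illustration.

```python
import numpy as np

def flip_labels_for_class(labels, target_class, new_label, fraction, seed=0):
    """Relabel a fraction of the samples of one class (label flipping).

    Assumption: a simple stand-in for selective poisoning, not the
    paper's method. Returns the poisoned labels and the flipped indices.
    """
    labels = np.asarray(labels).copy()
    rng = np.random.default_rng(seed)
    # Indices of the samples that belong to the chosen class.
    target_idx = np.flatnonzero(labels == target_class)
    n_flip = int(round(fraction * target_idx.size))
    flipped = rng.choice(target_idx, size=n_flip, replace=False)
    labels[flipped] = new_label
    return labels, flipped

def per_class_accuracy(y_true, y_pred, num_classes):
    """Accuracy computed separately for each class (NaN if a class is absent)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    acc = np.full(num_classes, np.nan)
    for c in range(num_classes):
        mask = y_true == c
        if mask.any():
            acc[c] = np.mean(y_pred[mask] == c)
    return acc

# Toy labels: class 1 is the chosen class; half of its samples are relabeled.
toy_labels = np.array([0, 1, 1, 2, 1, 1])
poisoned, flipped = flip_labels_for_class(toy_labels, target_class=1,
                                          new_label=7, fraction=0.5)
```

Training a classifier on `poisoned` and then reporting `per_class_accuracy` on clean test data would show whether the drop is confined to the chosen class, which is the quantity the abstract reports.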
Wide adoption of artificial neural networks in various domains has led to an increasing interest in defending them against adversarial attacks. Preprocessing defense methods such as pixel discretization are particularly attractive in practice due to their simplicity, low computational overhead, and applicability to various systems. It is observed that such methods work well on simple datasets like MNIST, but break on more complicated ones like ImageNet under recently proposed strong white-box attacks. To understand the conditions for success and the potential for improvement, we study the pixel discretization defense method, including more sophisticated variants that take into account the properties of the dataset being discretized. Our results again show poor resistance against the strong attacks. We analyze our results in a theoretical framework and offer strong evidence that pixel discretization is unlikely to work on all but the simplest datasets. Furthermore, our arguments offer insight into why some other preprocessing defenses may be insecure.
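Pixel discretization can be sketched in a few lines. The version below assumes pixel values in [0, 1] and a uniform codebook of evenly spaced levels; the data-dependent variants mentioned in the abstract are not reproduced here, and the function name `discretize_pixels` and the parameter `levels` are hypothetical.

```python
import numpy as np

def discretize_pixels(image, levels=4):
    """Snap each pixel in [0, 1] to the nearest of `levels` evenly spaced values.

    Assumption: a uniform codebook, chosen only for illustration.
    """
    if levels < 2:
        raise ValueError("levels must be at least 2")
    image = np.clip(np.asarray(image, dtype=np.float64), 0.0, 1.0)
    return np.round(image * (levels - 1)) / (levels - 1)

# With two levels the only bin boundary is at 0.5.
clean = np.array([0.10, 0.45, 0.90])
perturbed = clean + np.array([0.05, 0.10, 0.05])

clean_q = discretize_pixels(clean, levels=2)          # [0., 0., 1.]
perturbed_q = discretize_pixels(perturbed, levels=2)  # [0., 1., 1.]
```

The first and third perturbations stay inside their bins and are erased, while the second crosses the boundary at 0.5 and survives discretization. This toy case only illustrates the mechanism: the defense removes a perturbation only when it does not push a pixel across a bin boundary.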