Title | Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks |
Publication Type | Web Article |
Year of Publication | 2021 |
Authors | Lin, Weiran, Lucas, Keane, Bauer, Lujo, Reiter, Michael K., Sharif, Mahmood |
Keywords | 2021: January, 2022: January, CMU |
Abstract | Minimal adversarial perturbations added to inputs have been shown to be effective at fooling deep neural networks. In this paper, we introduce several innovations that make white-box targeted attacks follow the intuition of the attacker's goal: to trick the model to assign a higher probability to the target class than to any other, while staying within a specified distance from the original input. First, we propose a new loss function that explicitly captures the goal of targeted attacks, in particular, by using the logits of all classes instead of just a subset, as is common. We show that Auto-PGD with this loss function finds more adversarial examples than it does with other commonly used loss functions. Second, we propose a new attack method that uses a further developed version of our loss function capturing both the misclassification objective and the L∞ distance limit ϵ. This new attack method is relatively 1.5--4.2% more successful on the CIFAR10 dataset and relatively 8.2--14.9% more successful on the ImageNet dataset, than the next best state-of-the-art attack. We confirm using statistical tests that our attack outperforms state-of-the-art attacks on different datasets and values of ϵ and against different defenses. |
Citation Key | node-81228 |