Biblio

Filters: Keyword is adversarial defense
2022-10-16
MaungMaung, AprilPyone, Kiya, Hitoshi.  2021.  Ensemble of Key-Based Models: Defense Against Black-Box Adversarial Attacks. 2021 IEEE 10th Global Conference on Consumer Electronics (GCCE). :95–98.
We propose a voting ensemble of models trained on block-wise transformed images with secret keys as a defense against black-box attacks. Although key-based adversarial defenses are effective against gradient-based (white-box) attacks, they cannot defend against gradient-free (black-box) attacks, which do not require any secret keys. In the proposed ensemble, a number of models are trained on images transformed with different keys and block sizes, and a voting ensemble is then applied over the models. Experimental results show that the proposed defense achieves a clean accuracy of 95.56% and an attack success rate of less than 9% under attacks with a noise distance of 8/255 on the CIFAR-10 dataset.
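For illustration, a minimal sketch of the voting step described above: each model is paired with its own secret key and block size, the input is transformed per model, and the majority label wins. The `transform` argument stands in for the key-based block-wise transform (one concrete variant, keyed block-wise pixel shuffling, is sketched under the ICIP 2020 entry below); all names and parameters here are assumptions, not the authors' implementation.

```python
import numpy as np

def ensemble_vote(models, keys, block_sizes, transform, image):
    """Majority vote over models, each fed the image transformed with its own key.

    models: callables mapping a transformed image to a class index.
    transform: callable (image, key, block_size) -> transformed image.
    """
    votes = [model(transform(image, key, block_size))
             for model, key, block_size in zip(models, keys, block_sizes)]
    return int(np.bincount(np.asarray(votes)).argmax())
```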
2021-10-04
Liu, Yuan, Zhou, Pingqiang.  2020.  Defending Against Adversarial Attacks in Deep Learning with Robust Auxiliary Classifiers Utilizing Bit Plane Slicing. 2020 Asian Hardware Oriented Security and Trust Symposium (AsianHOST). :1–4.
Deep Neural Networks (DNNs) have been widely used in a variety of fields with great success. However, recent research indicates that DNNs are susceptible to adversarial attacks, which can easily fool well-trained DNNs without being detected by human eyes. In this paper, we propose to combine the target DNN model with robust bit-plane classifiers to defend against adversarial attacks. This stems from our finding that successful attacks generate imperceptible perturbations that mainly affect the low-order bits of pixel values in clean images. Hence, using bit planes instead of traditional RGB channels for convolution can effectively reduce the channel modification rate. We conduct experiments on the CIFAR-10 and GTSRB datasets. The results show that our defense method can effectively increase the model accuracy on CIFAR-10 from 8.72% to 85.99% on average under attacks, without sacrificing accuracy on clean images.
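A minimal NumPy sketch of bit-plane slicing, the preprocessing idea this abstract relies on; using only the high-order planes as convolution inputs is an assumption for illustration rather than the paper's exact pipeline.

```python
import numpy as np

def bit_planes(image_u8):
    """Split an HxWxC uint8 image into 8 binary planes per channel -> HxWx(8*C)."""
    planes = [(image_u8 >> b) & 1 for b in range(8)]      # bit 0 = least significant
    return np.concatenate(planes, axis=-1).astype(np.float32)

def high_order_planes(image_u8, keep=4):
    """Keep only the `keep` most significant planes; adversarial perturbations
    mostly flip the low-order bits, which are discarded here."""
    planes = [(image_u8 >> b) & 1 for b in range(8 - keep, 8)]
    return np.concatenate(planes, axis=-1).astype(np.float32)
```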
2021-05-20
MaungMaung, AprilPyone, Kiya, Hitoshi.  2020.  Encryption Inspired Adversarial Defense For Visual Classification. 2020 IEEE International Conference on Image Processing (ICIP). :1681–1685.
Conventional adversarial defenses reduce classification accuracy whether or not a model is under attack. Moreover, most image-processing-based defenses are defeated due to the problem of obfuscated gradients. In this paper, we propose a new adversarial defense, a defensive transform for both training and test images inspired by perceptual image encryption methods. The proposed method utilizes a block-wise pixel shuffling method with a secret key. The experiments are carried out on both adaptive and non-adaptive maximum-norm-bounded white-box attacks while considering obfuscated gradients. The results show that the proposed defense achieves high accuracy (91.55%) on clean images and (89.66%) on adversarial examples with a noise distance of 8/255 on the CIFAR-10 dataset. Thus, the proposed defense outperforms state-of-the-art adversarial defenses, including latent adversarial training, adversarial training, and thermometer encoding.
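A minimal NumPy sketch of block-wise pixel shuffling with a secret key, the defensive transform described above; the block size of 4 and the reuse of one permutation across blocks are assumptions for illustration, not the paper's exact code.

```python
import numpy as np

def keyed_block_shuffle(image, key, block_size=4):
    """Shuffle pixels inside each block with a permutation derived from `key`.

    Assumes height and width are divisible by `block_size` (true for CIFAR-10).
    The same secret key must be applied to both training and test images.
    """
    h, w, c = image.shape
    perm = np.random.default_rng(key).permutation(block_size * block_size * c)
    out = image.copy()
    for y in range(0, h, block_size):
        for x in range(0, w, block_size):
            block = out[y:y + block_size, x:x + block_size, :].reshape(-1)
            out[y:y + block_size, x:x + block_size, :] = \
                block[perm].reshape(block_size, block_size, c)
    return out
```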
2020-12-28
Raju, R. S., Lipasti, M.  2020.  BlurNet: Defense by Filtering the Feature Maps. 2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :38–46.

Recently, the field of adversarial machine learning has been garnering attention by showing that state-of-the-art deep neural networks are vulnerable to adversarial examples, stemming from small perturbations added to the input image. Adversarial examples are generated by a malicious adversary either by obtaining access to the model parameters, such as gradient information, to alter the input, or by attacking a substitute model and transferring the resulting malicious examples to the victim model. Specifically, one of these attack algorithms, Robust Physical Perturbations (RP2), generates adversarial images of stop signs with black and white stickers to achieve high targeted misclassification rates against standard-architecture traffic sign classifiers. In this paper, we propose BlurNet, a defense against the RP2 attack. First, we motivate the defense with a frequency analysis of the first-layer feature maps of the network on the LISA dataset, which shows that high-frequency noise is introduced into the input image by the RP2 algorithm. To remove the high-frequency noise, we introduce a depthwise convolution layer of standard blur kernels after the first layer. We perform a black-box transfer attack to show that low-pass filtering the feature maps is more beneficial than filtering the input. We then present various regularization schemes to incorporate this low-pass filtering behavior into the training regime of the network and perform white-box attacks. We conclude with an adaptive attack evaluation to show that the success rate of the attack drops from 90% to 20% with total variation regularization, one of the proposed defenses.
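A minimal PyTorch sketch of low-pass filtering the first-layer feature maps with a fixed depthwise blur, the mechanism this abstract describes; the 3x3 box-blur kernel, the channel count, and the module names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FeatureMapBlur(nn.Module):
    """Depthwise convolution with a frozen averaging (box-blur) kernel per channel."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.blur = nn.Conv2d(channels, channels, kernel_size,
                              padding=kernel_size // 2, groups=channels, bias=False)
        weight = torch.full((channels, 1, kernel_size, kernel_size),
                            1.0 / (kernel_size * kernel_size))
        self.blur.weight = nn.Parameter(weight, requires_grad=False)  # kernel is not trained

    def forward(self, x):
        return self.blur(x)

# Usage: insert right after the first convolution of an existing classifier.
first_conv = nn.Conv2d(3, 64, kernel_size=3, padding=1)
features = FeatureMapBlur(64)(first_conv(torch.randn(1, 3, 32, 32)))
```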

2020-04-17
Jang, Yunseok, Zhao, Tianchen, Hong, Seunghoon, Lee, Honglak.  2019.  Adversarial Defense via Learning to Generate Diverse Attacks. 2019 IEEE/CVF International Conference on Computer Vision (ICCV). :2740–2749.

With the remarkable success of deep learning, Deep Neural Networks (DNNs) have been applied as dominant tools to various machine learning domains. Despite this success, however, it has been found that DNNs are surprisingly vulnerable to malicious attacks; adding small, perceptually indistinguishable perturbations to the data can easily degrade classification performance. Adversarial training is an effective defense strategy to train a robust classifier. In this work, we propose to utilize a generator to learn how to create adversarial examples. Unlike the existing approaches that create a one-shot perturbation with a deterministic generator, we propose a recursive and stochastic generator that produces much stronger and more diverse perturbations, which comprehensively reveal the vulnerability of the target classifier. Our experimental results on the MNIST and CIFAR-10 datasets show that the classifier adversarially trained with our method yields more robust performance against various white-box and black-box attacks.
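A minimal PyTorch sketch of the general idea: a stochastic generator maps an image and a random code z to an L-infinity-bounded perturbation, and the classifier is adversarially trained on the result. The architecture, the 8/255 bound, and the single-step (non-recursive) update are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerturbationGenerator(nn.Module):
    """Maps (image, noise code z) to a perturbation bounded by eps in L-infinity."""
    def __init__(self, z_dim=8, eps=8 / 255):
        super().__init__()
        self.z_dim, self.eps = z_dim, eps
        self.net = nn.Sequential(
            nn.Conv2d(3 + z_dim, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh())

    def forward(self, x, z):
        z_map = z[:, :, None, None].expand(-1, -1, x.size(2), x.size(3))
        return self.eps * self.net(torch.cat([x, z_map], dim=1))

def training_step(classifier, generator, opt_c, opt_g, x, y):
    z = torch.randn(x.size(0), generator.z_dim, device=x.device)
    x_adv = (x + generator(x, z)).clamp(0, 1)
    # Generator maximizes the classifier's loss (negated objective); resampling z
    # every step is what yields diverse perturbations.
    opt_g.zero_grad()
    (-F.cross_entropy(classifier(x_adv), y)).backward()
    opt_g.step()
    # Classifier minimizes its loss on the (detached) adversarial examples.
    opt_c.zero_grad()
    F.cross_entropy(classifier(x_adv.detach()), y).backward()
    opt_c.step()
```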

2020-04-03
Song, Liwei, Shokri, Reza, Mittal, Prateek.  2019.  Membership Inference Attacks Against Adversarially Robust Deep Learning Models. 2019 IEEE Security and Privacy Workshops (SPW). :50–56.
In recent years, the research community has increasingly focused on understanding the security and privacy challenges posed by deep learning models. However, the security domain and the privacy domain have typically been considered separately. It is thus unclear whether the defense methods in one domain will have any unexpected impact on the other domain. In this paper, we take a step towards enhancing our understanding of deep learning models when the two domains are combined. We do this by measuring the success of membership inference attacks against two state-of-the-art adversarial defense methods that mitigate evasion attacks: adversarial training and provable defense. On the one hand, membership inference attacks aim to infer an individual's participation in the target model's training dataset and are known to be correlated with the target model's overfitting. On the other hand, adversarial defense methods aim to enhance the robustness of target models by ensuring that model predictions are unchanged within a small area around each sample in the training dataset. Intuitively, adversarial defenses may rely more heavily on the training dataset and thus be more vulnerable to membership inference attacks. By performing empirical membership inference attacks on both adversarially robust models and the corresponding undefended models, we find that the adversarial training method is indeed more susceptible to membership inference attacks, and the privacy leakage is directly correlated with model robustness. We also find that the provable defense approach does not lead to enhanced success of membership inference attacks; however, this comes at the cost of significantly reduced accuracy on benign data points, indicating that privacy, security, and prediction accuracy are not jointly achieved by these two approaches.
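A minimal NumPy sketch of a confidence-thresholding membership inference attack, one common baseline for the kind of measurement this abstract reports; the threshold value and the advantage metric are illustrative assumptions, not the paper's exact attack.

```python
import numpy as np

def infer_membership(confidences, labels, threshold=0.9):
    """Guess 'member' when the model's confidence in the true label is high.

    confidences: (N, num_classes) softmax outputs; labels: (N,) true classes.
    """
    true_class_conf = confidences[np.arange(len(labels)), labels]
    return true_class_conf >= threshold

def attack_advantage(guess_on_members, guess_on_nonmembers):
    """Gap between the true-positive rate on training points and the
    false-positive rate on held-out points; 0 means no measurable leakage."""
    return guess_on_members.mean() - guess_on_nonmembers.mean()
```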