Biblio

Filters: Keyword is black-box attacks
2022-12-20
Zhan, Yike, Zheng, Baolin, Wang, Qian, Mou, Ningping, Guo, Binqing, Li, Qi, Shen, Chao, Wang, Cong.  2022.  Towards Black-Box Adversarial Attacks on Interpretable Deep Learning Systems. 2022 IEEE International Conference on Multimedia and Expo (ICME). :1–6.
Recent works have empirically shown that neural network interpretability is susceptible to malicious manipulation. However, existing attacks against Interpretable Deep Learning Systems (IDLSes) all focus on the white-box setting, which is impractical in real-world scenarios. In this paper, we make the first attempt to attack IDLSes in the decision-based black-box setting. We propose a new framework called Dual Black-box Adversarial Attack (DBAA), which generates adversarial examples that are misclassified as the target class yet have interpretations very similar to those of their benign counterparts. We conduct comprehensive experiments on different combinations of classifiers and interpreters to illustrate the effectiveness of DBAA. Empirical results show that in all cases, DBAA achieves high attack success rates and Intersection over Union (IoU) scores.
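
The decision-based setting (only the predicted label is observed) and the IoU score between interpretations can be made concrete with a short sketch. This is illustrative only, not the authors' DBAA: the label oracle query_label, the starting target-class image, and the saliency-map inputs are hypothetical stand-ins.

    import numpy as np

    def iou(map_a, map_b, thresh=0.5):
        """Intersection over Union of two binarized saliency maps."""
        a, b = map_a >= thresh, map_b >= thresh
        inter = np.logical_and(a, b).sum()
        union = np.logical_or(a, b).sum()
        return inter / union if union else 1.0

    def decision_based_attack(query_label, x, x_start, target, steps=1000, sigma=0.05, seed=0):
        """Boundary-style random search: start from an image of the target class
        and drift toward the benign input x while staying in the target class.
        Only the top-1 label returned by query_label is ever observed."""
        rng = np.random.default_rng(seed)
        x_adv = x_start.copy()                      # assumed to be classified as `target`
        for _ in range(steps):
            pull = 0.05 * (x - x_adv)               # move toward the benign input
            noise = sigma * rng.standard_normal(x.shape)
            candidate = np.clip(x_adv + pull + noise, 0.0, 1.0)
            if query_label(candidate) == target:    # keep only target-class candidates
                x_adv = candidate
        return x_adv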
2022-11-18
Paudel, Bijay Raj, Itani, Aashish, Tragoudas, Spyros.  2021.  Resiliency of SNN on Black-Box Adversarial Attacks. 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA). :799–806.
Existing works indicate that Spiking Neural Networks (SNNs) are resilient to adversarial attacks, but they test against only a few attack models. This paper studies adversarial attacks on SNNs using additional attack models and shows that SNNs are not inherently robust against several few-pixel L0 black-box attacks. Additionally, a method to defend against such attacks in SNNs is presented. The SNNs and the effects of the adversarial attacks are evaluated both in software simulators and on SpiNNaker neuromorphic hardware.
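
For reference, a few-pixel L0 black-box attack can be as simple as the random-search sketch below, which perturbs at most k pixels per query. It is a generic illustration, not necessarily one of the attack models used in the paper; predict_label is a hypothetical query oracle and images are assumed to lie in [0, 1].

    import numpy as np

    def few_pixel_attack(predict_label, x, true_label, k=5, trials=500, seed=0):
        """Randomly overwrite at most k pixels per trial (L0-bounded) and return
        the first candidate whose predicted label differs from the true label."""
        rng = np.random.default_rng(seed)
        h, w = x.shape[:2]
        for _ in range(trials):
            candidate = x.copy()
            rows = rng.integers(0, h, size=k)
            cols = rng.integers(0, w, size=k)
            candidate[rows, cols] = rng.random(size=(k,) + x.shape[2:])
            if predict_label(candidate) != true_label:
                return candidate        # successful few-pixel evasion
        return None                     # no success within the query budget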
2022-07-15
Fan, Wenqi, Derr, Tyler, Zhao, Xiangyu, Ma, Yao, Liu, Hui, Wang, Jianping, Tang, Jiliang, Li, Qing.  2021.  Attacking Black-box Recommendations via Copying Cross-domain User Profiles. 2021 IEEE 37th International Conference on Data Engineering (ICDE). :1583–1594.
Recommender systems, which aim to suggest personalized lists of items to users, have drawn a lot of attention. Many state-of-the-art recommender systems are built on deep neural networks (DNNs). Recent studies have shown that these deep neural networks are vulnerable to attacks such as data poisoning, which generates fake users to promote a selected set of items. Correspondingly, effective defense strategies have been developed to detect these generated users with fake profiles. Thus, new strategies for creating more ‘realistic’ user profiles to promote a set of items should be investigated to further understand the vulnerability of DNN-based recommender systems. In this work, we present a novel framework, CopyAttack. It is a reinforcement-learning-based black-box attack method that harnesses real users from a source domain by copying their profiles into the target domain with the goal of promoting a subset of items. CopyAttack efficiently and effectively learns policy-gradient networks that first select, then refine/craft, user profiles from the source domain and ultimately copy them into the target domain. CopyAttack’s goal is to maximize the hit ratio of the targeted items in the Top-k recommendation lists of users in the target domain. We conducted experiments on two real-world datasets and empirically verified the effectiveness of the proposed framework. The implementation of CopyAttack is available at https://github.com/wenqifan03/CopyAttack.
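
The attack objective, the hit ratio of the targeted items in users' Top-k lists, is straightforward to compute; a minimal sketch follows (the data layout is assumed, not taken from the CopyAttack code).

    def hit_ratio_at_k(recommendations, target_items, k=10):
        """Fraction of users whose top-k recommendation list contains at least
        one of the promoted (target) items."""
        if not recommendations:
            return 0.0
        hits = sum(
            any(item in target_items for item in rec_list[:k])
            for rec_list in recommendations.values()
        )
        return hits / len(recommendations)

    # Example: two users; only u1 gets the promoted item i7 in its top-3 list.
    recs = {"u1": ["i9", "i4", "i7"], "u2": ["i2", "i5", "i8"]}
    print(hit_ratio_at_k(recs, target_items={"i7"}, k=3))   # 0.5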
2021-03-04
Crescenzo, G. D., Bahler, L., McIntosh, A.  2020.  Encrypted-Input Program Obfuscation: Simultaneous Security Against White-Box and Black-Box Attacks. 2020 IEEE Conference on Communications and Network Security (CNS). :1–9.

We consider the problem of protecting cloud services from simultaneous white-box and black-box attacks. Recent research in cryptographic program obfuscation considers the problem of protecting the confidentiality of programs and any secrets in them. In this model, a provable program obfuscation solution makes white-box attacks on the program no more useful than black-box attacks. Motivated by very recent results showing successful black-box attacks on machine learning programs run by cloud servers, we propose and study the approach of augmenting the program obfuscation model so as to achieve, in at least some classes of application scenarios, program confidentiality in the presence of both white-box and black-box attacks. We propose and formally define encrypted-input program obfuscation, where a key is shared between the entity obfuscating the program and the entity encrypting the program's inputs. We believe this model may be of interest in practical scenarios where cloud programs operate over encrypted data received from associated sensors (e.g., Internet of Things, Smart Grid). Under standard intractability assumptions, we show various results that are not known in the traditional cryptographic program obfuscation model; most notably: Yao's garbled circuit technique implies encrypted-input program obfuscation hiding all gates of an arbitrary polynomial circuit, and very efficient encrypted-input program obfuscation is achievable for range membership programs and a class of machine learning programs (i.e., decision trees). The latter solutions incur only a small constant overhead over the equivalent unobfuscated programs.
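
As a toy illustration of the garbled-circuit idea the abstract invokes (not the paper's construction, and not production cryptography), the sketch below garbles a single AND gate: each wire value gets a random label, and the table encrypts the output label under a pad derived from the pair of input labels, with a zero-byte tag so the evaluator can recognize the correct row.

    import os, hashlib, random

    LABEL = 16  # bytes per wire label

    def _xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def _pad(label_a, label_b):
        # Derive a 32-byte one-time pad from the pair of input-wire labels.
        return hashlib.sha256(label_a + label_b).digest()

    def garble_and_gate():
        """Garble a single AND gate: one random label per wire value, plus a shuffled
        table that encrypts the output label under each pair of input labels."""
        wires = {w: (os.urandom(LABEL), os.urandom(LABEL)) for w in ("a", "b", "out")}
        table = []
        for bit_a in (0, 1):
            for bit_b in (0, 1):
                out_label = wires["out"][bit_a & bit_b]
                row_plain = out_label + b"\x00" * LABEL        # zero tag marks a valid row
                table.append(_xor(_pad(wires["a"][bit_a], wires["b"][bit_b]), row_plain))
        random.shuffle(table)
        return wires, table

    def evaluate(table, label_a, label_b):
        """The evaluator holds one label per input wire and learns only the output label."""
        for row in table:
            plain = _xor(_pad(label_a, label_b), row)
            if plain.endswith(b"\x00" * LABEL):
                return plain[:LABEL]
        raise ValueError("no table row decrypted correctly")

    wires, table = garble_and_gate()
    out_label = evaluate(table, wires["a"][1], wires["b"][1])
    assert out_label == wires["out"][1 & 1]                    # AND(1, 1) = 1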

2020-09-04
Zhao, Pu, Liu, Sijia, Chen, Pin-Yu, Hoang, Nghia, Xu, Kaidi, Kailkhura, Bhavya, Lin, Xue.  2019.  On the Design of Black-Box Adversarial Examples by Leveraging Gradient-Free Optimization and Operator Splitting Method. 2019 IEEE/CVF International Conference on Computer Vision (ICCV). :121–130.
Robust machine learning is currently one of the most prominent topics, and it could help shape a future of advanced AI platforms that perform well not only in average cases but also in worst cases or adverse situations. Despite this long-term vision, however, existing studies on black-box adversarial attacks are still restricted to very specific threat models (e.g., a single distortion metric and restrictive assumptions on the target model's feedback to queries) and/or suffer from prohibitively high query complexity. To push for further advances in this field, we introduce a general framework based on an operator splitting method, the alternating direction method of multipliers (ADMM), to devise efficient, robust black-box attacks that work with various distortion metrics and feedback settings without incurring high query complexity. Due to the black-box nature of the threat model, the proposed ADMM solution framework is integrated with zeroth-order (ZO) optimization and Bayesian optimization (BO) and is thus applicable to the gradient-free regime. This results in two new black-box adversarial attack generation methods, ZO-ADMM and BO-ADMM. Our empirical evaluations on image classification datasets show that our proposed approaches have much lower function query complexities than state-of-the-art attack methods while achieving very competitive attack success rates.
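
The gradient-free regime mentioned here is usually handled with zeroth-order estimators that query only function values. The sketch below is a generic two-point estimator with random unit directions, given as background; it is not the ZO-ADMM or BO-ADMM solver itself, and loss is a hypothetical black-box objective.

    import numpy as np

    def zo_gradient(loss, x, mu=0.01, num_samples=200, seed=0):
        """Two-point zeroth-order gradient estimate of a black-box loss:
        average of d * (loss(x + mu*u) - loss(x - mu*u)) / (2*mu) * u
        over random unit directions u, where d is the input dimension."""
        rng = np.random.default_rng(seed)
        grad = np.zeros_like(x)
        for _ in range(num_samples):
            u = rng.standard_normal(x.shape)
            u /= np.linalg.norm(u)
            grad += x.size * (loss(x + mu * u) - loss(x - mu * u)) / (2.0 * mu) * u
        return grad / num_samples

    # Sanity check on loss(x) = ||x||^2, whose true gradient is 2x.
    x0 = np.array([1.0, -2.0, 0.5])
    print(zo_gradient(lambda v: float(np.dot(v, v)), x0))   # approx. [2, -4, 1]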
2020-04-17
Jang, Yunseok, Zhao, Tianchen, Hong, Seunghoon, Lee, Honglak.  2019.  Adversarial Defense via Learning to Generate Diverse Attacks. 2019 IEEE/CVF International Conference on Computer Vision (ICCV). :2740–2749.

With the remarkable success of deep learning, Deep Neural Networks (DNNs) have become dominant tools across machine learning domains. Despite this success, however, DNNs have been found to be surprisingly vulnerable to malicious attacks; adding small, perceptually indistinguishable perturbations to the data can easily degrade classification performance. Adversarial training is an effective defense strategy for training a robust classifier. In this work, we propose to utilize a generator to learn how to create adversarial examples. Unlike existing approaches that create a one-shot perturbation with a deterministic generator, we propose a recursive and stochastic generator that produces much stronger and more diverse perturbations, comprehensively revealing the vulnerability of the target classifier. Our experimental results on the MNIST and CIFAR-10 datasets show that a classifier adversarially trained with our method is more robust against various white-box and black-box attacks.
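
A rough sketch of the generator-based adversarial training idea is given below, assuming PyTorch: a stochastic generator g is trained to produce bounded perturbations that fool the classifier f, while f is trained on the perturbed inputs. It is not the paper's recursive architecture; the layer sizes, latent dimension, and epsilon bound are made-up placeholders, and inputs are assumed to be flattened and in [0, 1].

    import torch
    import torch.nn as nn

    def adv_train_step(f, g, opt_f, opt_g, x, y, eps=0.3):
        """One adversarial training step with a learned stochastic perturbation
        generator: g tries to raise f's loss, f trains on g's perturbed inputs."""
        z = torch.randn(x.size(0), 16)                          # stochastic seed for diversity
        delta = eps * torch.tanh(g(torch.cat([x, z], dim=1)))   # bounded perturbation
        x_adv = torch.clamp(x + delta, 0.0, 1.0)

        # Generator update: maximize the classifier's loss on perturbed inputs.
        loss_g = -nn.functional.cross_entropy(f(x_adv), y)
        opt_g.zero_grad()
        loss_g.backward()
        opt_g.step()

        # Classifier update: minimize the loss on freshly perturbed inputs.
        delta = eps * torch.tanh(g(torch.cat([x, z], dim=1))).detach()
        loss_f = nn.functional.cross_entropy(f(torch.clamp(x + delta, 0.0, 1.0)), y)
        opt_f.zero_grad()
        loss_f.backward()
        opt_f.step()
        return loss_f.item()

    # Tiny usage example with random data in place of a real dataset.
    f = nn.Sequential(nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
    g = nn.Sequential(nn.Linear(28 * 28 + 16, 128), nn.ReLU(), nn.Linear(128, 28 * 28))
    opt_f = torch.optim.Adam(f.parameters(), lr=1e-3)
    opt_g = torch.optim.Adam(g.parameters(), lr=1e-3)
    x, y = torch.rand(8, 28 * 28), torch.randint(0, 10, (8,))
    print(adv_train_step(f, g, opt_f, opt_g, x, y))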

2020-02-18
Huang, Yonghong, Verma, Utkarsh, Fralick, Celeste, Infantec-Lopez, Gabriel, Kumar, Brajesh, Woodward, Carl.  2019.  Malware Evasion Attack and Defense. 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :34–38.

Machine learning (ML) classifiers are vulnerable to adversarial examples. An adversarial example is an input sample that is slightly modified to induce misclassification by an ML classifier. In this work, we investigate white-box and grey-box evasion attacks against an ML-based malware detector and conduct performance evaluations in a real-world setting. We compare defense approaches for mitigating the attacks. We also propose a framework for deploying grey-box and black-box attacks against malware detection systems.
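
A black-box evasion attempt against a malware detector can be sketched as a simple query loop that flips attacker-controlled binary features until the verdict changes. This is a generic illustration under assumed interfaces, not the framework proposed in the paper; detect is a hypothetical boolean oracle and mutable_idx lists features the attacker can change without breaking the malware's functionality.

    import numpy as np

    def black_box_evasion(detect, features, mutable_idx, budget=200, seed=0):
        """Query-only evasion sketch: randomly flip modifiable binary features
        (e.g., imported APIs, section flags) until the detector says benign."""
        rng = np.random.default_rng(seed)
        x = features.copy()
        for _ in range(budget):
            i = rng.choice(mutable_idx)
            x[i] = 1 - x[i]                 # flip one attacker-controlled feature
            if not detect(x):
                return x                    # detector now classifies the sample as benign
        return None                         # evasion failed within the query budget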

Nasr, Milad, Shokri, Reza, Houmansadr, Amir.  2019.  Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-Box Inference Attacks against Centralized and Federated Learning. 2019 IEEE Symposium on Security and Privacy (SP). :739–753.

Deep neural networks are susceptible to various inference attacks as they remember information about their training data. We design white-box inference attacks to perform a comprehensive privacy analysis of deep learning models. We measure the privacy leakage through parameters of fully trained models as well as the parameter updates of models during training. We design inference algorithms for both centralized and federated learning, with respect to passive and active inference attackers, and assuming different adversary prior knowledge. We evaluate our novel white-box membership inference attacks against deep learning algorithms to trace their training data records. We show that a straightforward extension of the known black-box attacks to the white-box setting (through analyzing the outputs of activation functions) is ineffective. We therefore design new algorithms tailored to the white-box setting by exploiting the privacy vulnerabilities of the stochastic gradient descent algorithm, which is the algorithm used to train deep neural networks. We investigate the reasons why deep learning models may leak information about their training data. We then show that even well-generalized models are significantly susceptible to white-box membership inference attacks, by analyzing state-of-the-art pre-trained and publicly available models for the CIFAR dataset. We also show how adversarial participants, in the federated learning setting, can successfully run active membership inference attacks against other participants, even when the global model achieves high prediction accuracies.
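
As background for the white-box setting analyzed here, a membership-inference attack can build on per-example signals such as the loss and the gradient norm with respect to the model parameters, since training members tend to have smaller values of both under the final model. The sketch below (assuming PyTorch) only illustrates that kind of signal with a toy threshold rule; it is not the paper's attack model, and the thresholds are arbitrary placeholders.

    import torch
    import torch.nn as nn

    def white_box_features(model, x, y):
        """Per-example white-box signals: the loss and the parameter-gradient norm."""
        model.zero_grad()
        loss = nn.functional.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
        loss.backward()
        grad_norm = torch.sqrt(sum((p.grad ** 2).sum() for p in model.parameters()))
        return loss.item(), grad_norm.item()

    def infer_membership(loss_value, grad_norm, loss_thresh=0.5, grad_thresh=1.0):
        """Toy decision rule: small loss and small gradient suggest a training member."""
        return loss_value < loss_thresh and grad_norm < grad_thresh

    # Tiny usage example with a random model and input.
    model = nn.Sequential(nn.Linear(32, 2))
    x, y = torch.randn(32), torch.tensor(0)
    print(infer_membership(*white_box_features(model, x, y)))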

2019-01-16
Kreuk, F., Adi, Y., Cisse, M., Keshet, J.  2018.  Fooling End-To-End Speaker Verification With Adversarial Examples. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :1962–1966.
Automatic speaker verification systems are increasingly used as the primary means of authenticating customers. Recently, it has been proposed to train speaker verification systems using end-to-end deep neural models. In this paper, we show that such systems are vulnerable to adversarial example attacks. Adversarial examples are generated by adding a peculiar noise to original speaker examples in such a way that the two are almost indistinguishable to a human listener. Yet the generated waveforms, which sound like speaker A, can be used to fool such a system into accepting them as if they were uttered by speaker B. We present white-box attacks on a deep end-to-end network trained on either YOHO or NTIMIT. We also present two black-box attacks. In the first, we generate adversarial examples with a system trained on NTIMIT and perform the attack on a system trained on YOHO. In the second, we generate the adversarial examples with a system trained using Mel-spectrum features and perform the attack on a system trained using MFCCs. Our results show that one can significantly decrease the accuracy of a target system even when the adversarial examples are generated with a different system, potentially using different features.
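
The waveform perturbations described here are commonly produced with an FGSM-style step in the sign of the loss gradient; the sketch below illustrates that step and the transfer-style black-box use, under assumed shapes and a stand-in surrogate gradient (the paper's exact procedure may differ).

    import numpy as np

    def fgsm_waveform(waveform, grad_wrt_input, epsilon=0.002):
        """FGSM-style adversarial perturbation of an audio waveform: step in the
        sign of the loss gradient and keep samples in the valid [-1, 1] range."""
        adv = waveform + epsilon * np.sign(grad_wrt_input)
        return np.clip(adv, -1.0, 1.0)

    # Black-box transfer: the gradient comes from a surrogate verifier, and the
    # perturbed waveform is then played against the unseen target verifier.
    wave = np.random.uniform(-1.0, 1.0, 16000)      # hypothetical 1-second utterance
    surrogate_grad = np.random.randn(16000)         # stand-in for a real surrogate gradient
    adv_wave = fgsm_waveform(wave, surrogate_grad)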