Biblio

List
Filter

Found 17 results

Filters: Keyword is adversarial attack [Clear All Filters]

2023-06-22

Jamil, Huma, Liu, Yajing, Cole, Christina, Blanchard, Nathaniel, King, Emily J., Kirby, Michael, Peterson, Christopher. 2022. Dual Graphs of Polyhedral Decompositions for the Detection of Adversarial Attacks. 2022 IEEE International Conference on Big Data (Big Data). :2913–2921.

Previous work has shown that a neural network with the rectified linear unit (ReLU) activation function leads to a convex polyhedral decomposition of the input space. These decompositions can be represented by a dual graph with vertices corresponding to polyhedra and edges corresponding to polyhedra sharing a facet, which is a subgraph of a Hamming graph. This paper illustrates how one can utilize the dual graph to detect and analyze adversarial attacks in the context of digital images. When an image passes through a network containing ReLU nodes, the firing or non-firing at a node can be encoded as a bit (1 for ReLU activation, 0 for ReLU non-activation). The sequence of all bit activations identifies the image with a bit vector, which identifies it with a polyhedron in the decomposition and, in turn, identifies it with a vertex in the dual graph. We identify ReLU bits that are discriminators between non-adversarial and adversarial images and examine how well collections of these discriminators can ensemble vote to build an adversarial image detector. Specifically, we examine the similarities and differences of ReLU bit vectors for adversarial images, and their non-adversarial counterparts, using a pre-trained ResNet-50 architecture. While this paper focuses on adversarial digital images, ResNet-50 architecture, and the ReLU activation function, our methods extend to other network architectures, activation functions, and types of datasets.

2023-03-31

Wu, Xiaoliang, Rajan, Ajitha. 2022. Catch Me If You Can: Blackbox Adversarial Attacks on Automatic Speech Recognition using Frequency Masking. 2022 29th Asia-Pacific Software Engineering Conference (APSEC). :169–178.

Automatic speech recognition (ASR) models are used widely in applications for voice navigation and voice control of domestic appliances. ASRs have been misused by attackers to generate malicious outputs by attacking the deep learning component within ASRs. To assess the security and robustnesss of ASRs, we propose techniques within our framework SPAT that generate blackbox (agnostic to the DNN) adversarial attacks that are portable across ASRs. This is in contrast to existing work that focuses on whitebox attacks that are time consuming and lack portability. Our techniques generate adversarial attacks that have no human audible difference by manipulating the input speech signal using a psychoacoustic model that maintains the audio perturbations below the thresholds of human perception. We propose a framework SPAT with three attack generation techniques based on the psychoacoustic concept and frame selection techniques to selectively target the attack. We evaluate portability and effectiveness of our techniques using three popular ASRs and two input audio datasets using the metrics- Word Error Rate (WER) of output transcription, Similarity to original audio, attack Success Rate on different ASRs and Detection score by a defense system. We found our adversarial attacks were portable across ASRs, not easily detected by a state-of the-art defense system, and had significant difference in output transcriptions while sounding similar to original audio.

2022-12-20

Rakin, Adnan Siraj, Chowdhuryy, Md Hafizul Islam, Yao, Fan, Fan, Deliang. 2022. DeepSteal: Advanced Model Extractions Leveraging Efficient Weight Stealing in Memories. 2022 IEEE Symposium on Security and Privacy (SP). :1157–1174.

Recent advancements in Deep Neural Networks (DNNs) have enabled widespread deployment in multiple security-sensitive domains. The need for resource-intensive training and the use of valuable domain-specific training data have made these models the top intellectual property (IP) for model owners. One of the major threats to DNN privacy is model extraction attacks where adversaries attempt to steal sensitive information in DNN models. In this work, we propose an advanced model extraction framework DeepSteal that steals DNN weights remotely for the first time with the aid of a memory side-channel attack. Our proposed DeepSteal comprises two key stages. Firstly, we develop a new weight bit information extraction method, called HammerLeak, through adopting the rowhammer-based fault technique as the information leakage vector. HammerLeak leverages several novel system-level techniques tailored for DNN applications to enable fast and efficient weight stealing. Secondly, we propose a novel substitute model training algorithm with Mean Clustering weight penalty, which leverages the partial leaked bit information effectively and generates a substitute prototype of the target victim model. We evaluate the proposed model extraction framework on three popular image datasets (e.g., CIFAR-10/100/GTSRB) and four DNN architectures (e.g., ResNet-18/34/Wide-ResNetNGG-11). The extracted substitute model has successfully achieved more than 90% test accuracy on deep residual networks for the CIFAR-10 dataset. Moreover, our extracted substitute model could also generate effective adversarial input samples to fool the victim model. Notably, it achieves similar performance (i.e., 1-2% test accuracy under attack) as white-box adversarial input attack (e.g., PGD/Trades).

ISSN: 2375-1207

Liu, Xiaolei, Li, Xiaoyu, Zheng, Desheng, Bai, Jiayu, Peng, Yu, Zhang, Shibin. 2022. Automatic Selection Attacks Framework for Hard Label Black-Box Models. IEEE INFOCOM 2022 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). :1–7.

The current adversarial attacks against machine learning models can be divided into white-box attacks and black-box attacks. Further the black-box can be subdivided into soft label and hard label black-box, but the latter has the deficiency of only returning the class with the highest prediction probability, which leads to the difficulty in gradient estimation. However, due to its wide application, it is of great research significance and application value to explore hard label blackbox attacks. This paper proposes an Automatic Selection Attacks Framework (ASAF) for hard label black-box models, which can be explained in two aspects based on the existing attack methods. Firstly, ASAF applies model equivalence to select substitute models automatically so as to generate adversarial examples and then completes black-box attacks based on their transferability. Secondly, specified feature selection and parallel attack method are proposed to shorten the attack time and improve the attack success rate. The experimental results show that ASAF can achieve more than 90% success rate of nontargeted attack on the common models of traditional dataset ResNet-101 (CIFAR10) and InceptionV4 (ImageNet). Meanwhile, compared with FGSM and other attack algorithms, the attack time is reduced by at least 89.7% and 87.8% respectively in two traditional datasets. Besides, it can achieve 90% success rate of attack on the online model, BaiduAI digital recognition. In conclusion, ASAF is the first automatic selection attacks framework for hard label blackbox models, in which specified feature selection and parallel attack methods speed up automatic attacks.

2022-09-20

Chen, Tong, Xiang, Yingxiao, Li, Yike, Tian, Yunzhe, Tong, Endong, Niu, Wenjia, Liu, Jiqiang, Li, Gang, Alfred Chen, Qi. 2021. Protecting Reward Function of Reinforcement Learning via Minimal and Non-catastrophic Adversarial Trajectory. 2021 40th International Symposium on Reliable Distributed Systems (SRDS). :299—309.

Reward functions are critical hyperparameters with commercial values for individual or distributed reinforcement learning (RL), as slightly different reward functions result in significantly different performance. However, existing inverse reinforcement learning (IRL) methods can be utilized to approximate reward functions just based on collected expert trajectories through observing. Thus, in the real RL process, how to generate a polluted trajectory and perform an adversarial attack on IRL for protecting reward functions has become the key issue. Meanwhile, considering the actual RL cost, generated adversarial trajectories should be minimal and non-catastrophic for ensuring normal RL performance. In this work, we propose a novel approach to craft adversarial trajectories disguised as expert ones, for decreasing the IRL performance and realize the anti-IRL ability. Firstly, we design a reward clustering-based metric to integrate both advantages of fine- and coarse-grained IRL assessment, including expected value difference (EVD) and mean reward loss (MRL). Further, based on such metric, we explore an adversarial attack based on agglomerative nesting algorithm (AGNES) clustering and determine targeted states as starting states for reward perturbation. Then we employ the intrinsic fear model to predict the probability of imminent catastrophe, supporting to generate non-catastrophic adversarial trajectories. Extensive experiments of 7 state-of-the-art IRL algorithms are implemented on the Object World benchmark, demonstrating the capability of our proposed approach in (a) decreasing the IRL performance and (b) having minimal and non-catastrophic adversarial trajectories.

2022-01-31

Wang, Xiying, Ni, Rongrong, Li, Wenjie, Zhao, Yao. 2021. Adversarial Attack on Fake-Faces Detectors Under White and Black Box Scenarios. 2021 IEEE International Conference on Image Processing (ICIP). :3627–3631.

Generative Adversarial Network (GAN) models have been widely used in various fields. More recently, styleGAN and styleGAN2 have been developed to synthesize faces that are indistinguishable to the human eyes, which could pose a threat to public security. But latest work has shown that it is possible to identify fakes using powerful CNN networks as classifiers. However, the reliability of these techniques is unknown. Therefore, in this paper we focus on the generation of content-preserving images from fake faces to spoof classifiers. Two GAN-based frameworks are proposed to achieve the goal in the white-box and black-box. For the white-box, a network without up/down sampling is proposed to generate face images to confuse the classifier. In the black-box scenario (where the classifier is unknown), real data is introduced as a guidance for GAN structure to make it adversarial, and a Real Extractor as an auxiliary network to constrain the feature distance between the generated images and the real data to enhance the adversarial capability. Experimental results show that the proposed method effectively reduces the detection accuracy of forensic models with good transferability.

2022-01-10

Ngo, Quoc-Dung, Nguyen, Huy-Trung, Nguyen, Viet-Dung, Dinh, Cong-Minh, Phung, Anh-Tu, Bui, Quy-Tung. 2021. Adversarial Attack and Defense on Graph-based IoT Botnet Detection Approach. 2021 International Conference on Electrical, Communication, and Computer Engineering (ICECCE). :1–6.

To reduce the risk of botnet malware, methods of detecting botnet malware using machine learning have received enormous attention in recent years. Most of the traditional methods are based on supervised learning that relies on static features with defined labels. However, recent studies show that supervised machine learning-based IoT malware botnet models are more vulnerable to intentional attacks, known as an adversarial attack. In this paper, we study the adversarial attack on PSI-graph based researches. To perform the efficient attack, we proposed a reinforcement learning based method with a trained target classifier to modify the structures of PSI-graphs. We show that PSI-graphs are vulnerable to such attack. We also discuss about defense method which uses adversarial training to train a defensive model. Experiment result achieves 94.1% accuracy on the adversarial dataset; thus, shows that our defensive model is much more robust than the previous target classifier.

2021-05-13

Li, Xu, Zhong, Jinghua, Wu, Xixin, Yu, Jianwei, Liu, Xunying, Meng, Helen. 2020. Adversarial Attacks on GMM I-Vector Based Speaker Verification Systems. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :6579—6583.

This work investigates the vulnerability of Gaussian Mixture Model (GMM) i-vector based speaker verification systems to adversarial attacks, and the transferability of adversarial samples crafted from GMM i-vector based systems to x-vector based systems. In detail, we formulate the GMM i-vector system as a scoring function of enrollment and testing utterance pairs. Then we leverage the fast gradient sign method (FGSM) to optimize testing utterances for adversarial samples generation. These adversarial samples are used to attack both GMM i-vector and x-vector systems. We measure the system vulnerability by the degradation of equal error rate and false acceptance rate. Experiment results show that GMM i-vector systems are seriously vulnerable to adversarial attacks, and the crafted adversarial samples are proved to be transferable and pose threats to neural network speaker embedding based systems (e.g. x-vector systems).

2021-03-01

Kuppa, A., Le-Khac, N.-A.. 2020. Black Box Attacks on Explainable Artificial Intelligence(XAI) methods in Cyber Security. 2020 International Joint Conference on Neural Networks (IJCNN). :1–8.

Cybersecurity community is slowly leveraging Machine Learning (ML) to combat ever evolving threats. One of the biggest drivers for successful adoption of these models is how well domain experts and users are able to understand and trust their functionality. As these black-box models are being employed to make important predictions, the demand for transparency and explainability is increasing from the stakeholders.Explanations supporting the output of ML models are crucial in cyber security, where experts require far more information from the model than a simple binary output for their analysis. Recent approaches in the literature have focused on three different areas: (a) creating and improving explainability methods which help users better understand the internal workings of ML models and their outputs; (b) attacks on interpreters in white box setting; (c) defining the exact properties and metrics of the explanations generated by models. However, they have not covered, the security properties and threat models relevant to cybersecurity domain, and attacks on explainable models in black box settings.In this paper, we bridge this gap by proposing a taxonomy for Explainable Artificial Intelligence (XAI) methods, covering various security properties and threat models relevant to cyber security domain. We design a novel black box attack for analyzing the consistency, correctness and confidence security properties of gradient based XAI methods. We validate our proposed system on 3 security-relevant data-sets and models, and demonstrate that the method achieves attacker's goal of misleading both the classifier and explanation report and, only explainability method without affecting the classifier output. Our evaluation of the proposed approach shows promising results and can help in designing secure and robust XAI methods.

2021-01-25

Chen, J., Lin, X., Shi, Z., Liu, Y.. 2020. Link Prediction Adversarial Attack Via Iterative Gradient Attack. IEEE Transactions on Computational Social Systems. 7:1081–1094.

Increasing deep neural networks are applied in solving graph evolved tasks, such as node classification and link prediction. However, the vulnerability of deep models can be revealed using carefully crafted adversarial examples generated by various adversarial attack methods. To explore this security problem, we define the link prediction adversarial attack problem and put forward a novel iterative gradient attack (IGA) strategy using the gradient information in the trained graph autoencoder (GAE) model. Not surprisingly, GAE can be fooled by an adversarial graph with a few links perturbed on the clean one. The results on comprehensive experiments of different real-world graphs indicate that most deep models and even the state-of-the-art link prediction algorithms cannot escape the adversarial attack, such as GAE. We can benefit the attack as an efficient privacy protection tool from the link prediction of unknown violations. On the other hand, the adversarial attack is a robust evaluation metric for current link prediction algorithms of their defensibility.

2020-11-04

Shen, J., Zhu, X., Ma, D.. 2019. TensorClog: An Imperceptible Poisoning Attack on Deep Neural Network Applications. IEEE Access. 7:41498—41506.

Internet application providers now have more incentive than ever to collect user data, which greatly increases the risk of user privacy violations due to the emerging of deep neural networks. In this paper, we propose TensorClog-a poisoning attack technique that is designed for privacy protection against deep neural networks. TensorClog has three properties with each of them serving a privacy protection purpose: 1) training on TensorClog poisoned data results in lower inference accuracy, reducing the incentive of abusive data collection; 2) training on TensorClog poisoned data converges to a larger loss, which prevents the neural network from learning the privacy; and 3) TensorClog regularizes the perturbation to remain a high structure similarity, so that the poisoning does not affect the actual content in the data. Applying our TensorClog poisoning technique to CIFAR-10 dataset results in an increase in both converged training loss and test error by 300% and 272%, respectively. It manages to maintain data's human perception with a high SSIM index of 0.9905. More experiments including different limited information attack scenarios and a real-world application transferred from pre-trained ImageNet models are presented to further evaluate TensorClog's effectiveness in more complex situations.

2020-10-29

Vi, Bao Ngoc, Noi Nguyen, Huu, Nguyen, Ngoc Tran, Truong Tran, Cao. 2019. Adversarial Examples Against Image-based Malware Classification Systems. 2019 11th International Conference on Knowledge and Systems Engineering (KSE). :1—5.

Malicious software, known as malware, has become urgently serious threat for computer security, so automatic mal-ware classification techniques have received increasing attention. In recent years, deep learning (DL) techniques for computer vision have been successfully applied for malware classification by visualizing malware files and then using DL to classify visualized images. Although DL-based classification systems have been proven to be much more accurate than conventional ones, these systems have been shown to be vulnerable to adversarial attacks. However, there has been little research to consider the danger of adversarial attacks to visualized image-based malware classification systems. This paper proposes an adversarial attack method based on the gradient to attack image-based malware classification systems by introducing perturbations on resource section of PE files. The experimental results on the Malimg dataset show that by a small interference, the proposed method can achieve success attack rate when challenging convolutional neural network malware classifiers.

2020-09-04

Taori, Rohan, Kamsetty, Amog, Chu, Brenton, Vemuri, Nikita. 2019. Targeted Adversarial Examples for Black Box Audio Systems. 2019 IEEE Security and Privacy Workshops (SPW). :15—20.

The application of deep recurrent networks to audio transcription has led to impressive gains in automatic speech recognition (ASR) systems. Many have demonstrated that small adversarial perturbations can fool deep neural networks into incorrectly predicting a specified target with high confidence. Current work on fooling ASR systems have focused on white-box attacks, in which the model architecture and parameters are known. In this paper, we adopt a black-box approach to adversarial generation, combining the approaches of both genetic algorithms and gradient estimation to solve the task. We achieve a 89.25% targeted attack similarity, with 35% targeted attack success rate, after 3000 generations while maintaining 94.6% audio file similarity.

2020-07-20

Pengcheng, Li, Yi, Jinfeng, Zhang, Lijun. 2018. Query-Efficient Black-Box Attack by Active Learning. 2018 IEEE International Conference on Data Mining (ICDM). :1200–1205.

Deep neural network (DNN) as a popular machine learning model is found to be vulnerable to adversarial attack. This attack constructs adversarial examples by adding small perturbations to the raw input, while appearing unmodified to human eyes but will be misclassified by a well-trained classifier. In this paper, we focus on the black-box attack setting where attackers have almost no access to the underlying models. To conduct black-box attack, a popular approach aims to train a substitute model based on the information queried from the target DNN. The substitute model can then be attacked using existing white-box attack approaches, and the generated adversarial examples will be used to attack the target DNN. Despite its encouraging results, this approach suffers from poor query efficiency, i.e., attackers usually needs to query a huge amount of times to collect enough information for training an accurate substitute model. To this end, we first utilize state-of-the-art white-box attack methods to generate samples for querying, and then introduce an active learning strategy to significantly reduce the number of queries needed. Besides, we also propose a diversity criterion to avoid the sampling bias. Our extensive experimental results on MNIST and CIFAR-10 show that the proposed method can reduce more than 90% of queries while preserve attacking success rates and obtain an accurate substitute model which is more than 85% similar with the target oracle.

2020-07-03

Adari, Suman Kalyan, Garcia, Washington, Butler, Kevin. 2019. Adversarial Video Captioning. 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :24—27.

In recent years, developments in the field of computer vision have allowed deep learning-based techniques to surpass human-level performance. However, these advances have also culminated in the advent of adversarial machine learning techniques, capable of launching targeted image captioning attacks that easily fool deep learning models. Although attacks in the image domain are well studied, little work has been done in the video domain. In this paper, we show it is possible to extend prior attacks in the image domain to the video captioning task, without heavily affecting the video's playback quality. We demonstrate our attack against a state-of-the-art video captioning model, by extending a prior image captioning attack known as Show and Fool. To the best of our knowledge, this is the first successful method for targeted attacks against a video captioning model, which is able to inject 'subliminal' perturbations into the video stream, and force the model to output a chosen caption with up to 0.981 cosine similarity, achieving near-perfect similarity to chosen target captions.

2018-04-11

Chen, Lingwei, Hou, Shifu, Ye, Yanfang. 2017. SecureDroid: Enhancing Security of Machine Learning-Based Detection Against Adversarial Android Malware Attacks. Proceedings of the 33rd Annual Computer Security Applications Conference. :362–372.

With smart phones being indispensable in people's everyday life, Android malware has posed serious threats to their security, making its detection of utmost concern. To protect legitimate users from the evolving Android malware attacks, machine learning-based systems have been successfully deployed and offer unparalleled flexibility in automatic Android malware detection. In these systems, based on different feature representations, various kinds of classifiers are constructed to detect Android malware. Unfortunately, as classifiers become more widely deployed, the incentive for defeating them increases. In this paper, we explore the security of machine learning in Android malware detection on the basis of a learning-based classifier with the input of a set of features extracted from the Android applications (apps). We consider different importances of the features associated with their contributions to the classification problem as well as their manipulation costs, and present a novel feature selection method (named SecCLS) to make the classifier harder to be evaded. To improve the system security while not compromising the detection accuracy, we further propose an ensemble learning approach (named SecENS) by aggregating the individual classifiers that are constructed using our proposed feature selection method SecCLS. Accordingly, we develop a system called SecureDroid which integrates our proposed methods (i.e., SecCLS and SecENS) to enhance security of machine learning-based Android malware detection. Comprehensive experiments on the real sample collections from Comodo Cloud Security Center are conducted to validate the effectiveness of SecureDroid against adversarial Android malware attacks by comparisons with other alternative defense methods. Our proposed secure-learning paradigm can also be readily applied to other malware detection tasks.

2018-03-19

Showkatbakhsh, M., Shoukry, Y., Chen, R. H., Diggavi, S., Tabuada, P.. 2017. An SMT-Based Approach to Secure State Estimation under Sensor and Actuator Attacks. 2017 IEEE 56th Annual Conference on Decision and Control (CDC). :157–162.

This paper addresses the problem of state estimation of a linear time-invariant system when some of the sensors or/and actuators are under adversarial attack. In our set-up, the adversarial agent attacks a sensor (actuator) by manipulating its measurement (input), and we impose no constraint on how the measurements (inputs) are corrupted. We introduce the notion of ``sparse strong observability'' to characterize systems for which the state estimation is possible, given bounds on the number of attacked sensors and actuators. Furthermore, we develop a secure state estimator based on Satisfiability Modulo Theory (SMT) solvers.