Roy, Arunava, Dasgupta, Dipankar.
2022.
A Robust Framework for Adaptive Selection of Filter Ensembles to Detect Adversarial Inputs. 2022 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :59—67.
Existing defense strategies against adversarial attacks (AAs) on AI/ML are primarily focused on examining the input data streams using a wide variety of filtering techniques. For instance, input filters are used to remove noisy, misleading, and out-of-class inputs along with a variety of attacks on learning systems. However, a single filter may not be able to detect all types of AAs. To address this issue, in the current work, we propose a robust, transferable, distribution-independent, and cross-domain supported framework for selecting Adaptive Filter Ensembles (AFEs) to minimize the impact of data poisoning on learning systems. The optimal filter ensembles are determined through a Multi-Objective Bi-Level Programming Problem (MOBLPP) that provides a subset of diverse filter sequences, each exhibiting fair detection accuracy. The proposed framework of AFE is trained to model the pristine data distribution to identify the corrupted inputs and converges to the optimal AFE without vanishing gradients and mode collapses irrespective of input data distributions. We presented preliminary experiments to show the proposed defense outperforms the existing defenses in terms of robustness and accuracy.
Erbil, Pinar, Gursoy, M. Emre.
2022.
Detection and Mitigation of Targeted Data Poisoning Attacks in Federated Learning. 2022 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech). :1—8.
Federated learning (FL) has emerged as a promising paradigm for distributed training of machine learning models. In FL, several participants train a global model collaboratively by only sharing model parameter updates while keeping their training data local. However, FL was recently shown to be vulnerable to data poisoning attacks, in which malicious participants send parameter updates derived from poisoned training data. In this paper, we focus on defending against targeted data poisoning attacks, where the attacker’s goal is to make the model misbehave for a small subset of classes while the rest of the model is relatively unaffected. To defend against such attacks, we first propose a method called MAPPS for separating malicious updates from benign ones. Using MAPPS, we propose three methods for attack detection: MAPPS + X-Means, MAPPS + VAT, and their Ensemble. Then, we propose an attack mitigation approach in which a "clean" model (i.e., a model that is not negatively impacted by an attack) can be trained despite the existence of a poisoning attempt. We empirically evaluate all of our methods using popular image classification datasets. Results show that we can achieve \textgreater 95% true positive rates while incurring only \textless 2% false positive rate. Furthermore, the clean models that are trained using our proposed methods have accuracy comparable to models trained in an attack-free scenario.
Jagadeesha, Nishchal.
2022.
Facial Privacy Preservation using FGSM and Universal Perturbation attacks. 2022 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COM-IT-CON). 1:46—52.
Research done in Facial Privacy so far has entrenched the scope of gleaning race, age, and gender from a human’s facial image that are classifiable and compliant biometric attributes. Noticeable distortions, morphing, and face-swapping are some of the techniques that have been researched to restore consumers’ privacy. By fooling face recognition models, these techniques cater superficially to the needs of user privacy, however, the presence of visible manipulations negatively affects the aesthetic of the image. The objective of this work is to highlight common adversarial techniques that can be used to introduce granular pixel distortions using white-box and black-box perturbation algorithms that ensure the privacy of users’ sensitive or personal data in face images, fooling AI facial recognition models while maintaining the aesthetics of and visual integrity of the image.