Biblio

Filters: Author is Liu, Sijia
2023-01-13
Taneja, Vardaan, Chen, Pin-Yu, Yao, Yuguang, Liu, Sijia.  2022.  When Does Backdoor Attack Succeed in Image Reconstruction? A Study of Heuristics vs. Bi-Level Solution. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :4398–4402.
Recent studies have demonstrated the lack of robustness of image reconstruction networks to test-time evasion attacks, posing security risks and the potential for misdiagnoses. In this paper, we provide the first evaluation of how vulnerable such networks are to training-time poisoning attacks. In contrast to image classification, we find that basic trigger-embedded backdoor attacks on these models, executed using heuristics, lead to poor attack performance; it is thus non-trivial to generate backdoor attacks for image reconstruction. To tackle this problem, we propose a bi-level optimization (BLO)-based attack generation method and investigate its effectiveness on image reconstruction. We show that BLO-generated backdoor attacks yield a significant improvement over the heuristics-based attack strategy.
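For intuition, here is a minimal sketch of the bi-level idea behind such attack generation, not the paper's method: an outer loop tunes a trigger so that a model trained in the inner loop on the poisoned data maps triggered inputs to an attacker-chosen target. The linear "reconstruction" model, the synthetic data, the target vector, and the finite-difference outer gradient are all illustrative assumptions.

```python
# Minimal sketch, assuming a linear "reconstruction" model y = W x and a
# finite-difference outer gradient; not the paper's BLO method.
import numpy as np

rng = np.random.default_rng(0)
d = 8                                    # signal dimension (illustrative)
X = rng.normal(size=(200, d))            # clean training inputs
Y = X.copy()                             # identity reconstruction task
target = np.ones(d)                      # attacker-chosen reconstruction
X_test = rng.normal(size=(50, d))        # held-out inputs for the outer loss

def train_inner(delta, steps=200, lr=0.05, poison_frac=0.1):
    """Inner level: fit W on a training set where a fraction of inputs carry the trigger."""
    n_poison = int(poison_frac * len(X))
    Xp, Yp = X.copy(), Y.copy()
    Xp[:n_poison] += delta               # stamp the trigger
    Yp[:n_poison] = target               # relabel poisoned samples
    W = np.eye(d)
    for _ in range(steps):
        grad = 2 * (Xp @ W.T - Yp).T @ Xp / len(Xp)
        W -= lr * grad
    return W

def attack_loss(delta):
    """Outer objective: triggered test inputs should reconstruct to the target."""
    W = train_inner(delta)
    return np.mean(((X_test + delta) @ W.T - target) ** 2)

delta, eps, lr_outer = np.zeros(d), 1e-2, 0.5
for it in range(15):                     # outer loop with a finite-difference gradient
    base = attack_loss(delta)
    g = np.zeros(d)
    for i in range(d):
        e = np.zeros(d); e[i] = eps
        g[i] = (attack_loss(delta + e) - base) / eps
    delta -= lr_outer * g
    print(f"outer iter {it:2d}  attack loss {base:.4f}")
```

In practice the inner problem would be a deep reconstruction network, and the outer gradient would be obtained by implicit differentiation or unrolling rather than coordinate-wise finite differences.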
2023-01-06
Chen, Tianlong, Zhang, Zhenyu, Zhang, Yihua, Chang, Shiyu, Liu, Sijia, Wang, Zhangyang.  2022.  Quarantine: Sparsity Can Uncover the Trojan Attack Trigger for Free. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :588–599.
Trojan attacks threaten deep neural networks (DNNs) by poisoning them to behave normally on most samples yet produce manipulated results for inputs stamped with a particular trigger. Several works attempt to detect whether a given DNN has been injected with a specific trigger during training. In a parallel line of research, the lottery ticket hypothesis reveals the existence of sparse sub-networks that can reach performance competitive with the dense network after independent training. Connecting these two dots, we investigate the problem of Trojan DNN detection through the brand-new lens of sparsity, even when no clean training data is available. Our crucial observation is that the Trojan features are significantly more stable under network pruning than benign features. Leveraging this, we propose a novel Trojan network detection regime: first locating a “winning Trojan lottery ticket” that preserves nearly full Trojan information yet achieves only chance-level performance on clean inputs, then recovering the trigger embedded in this already isolated sub-network. Extensive experiments on various datasets (CIFAR-10, CIFAR-100, and ImageNet) with different network architectures (VGG-16, ResNet-18, ResNet-20s, and DenseNet-100) demonstrate the effectiveness of our proposal. Code is available at https://github.com/VITA-Group/Backdoor-LTH.
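The detection regime hinges on clean accuracy and backdoor behavior degrading differently as weights are pruned. Below is a toy sketch of that pruning-scan measurement, not the authors' code (which is at the linked repository): the two-layer numpy MLP, the synthetic data, and the single-feature trigger are illustrative assumptions, so the exact trend may be noisier than in the paper.

```python
# Minimal sketch, assuming a synthetic backdoored two-layer MLP; not the
# authors' implementation (see https://github.com/VITA-Group/Backdoor-LTH).
import numpy as np

rng = np.random.default_rng(1)
d, h, n = 10, 32, 2000
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = (X @ w_true > 0).astype(int)

# Poison 10% of the training set: the "trigger" is a large value in the last feature.
idx = rng.choice(n, n // 10, replace=False)
X[idx, -1] = 4.0
y[idx] = 1

W1, b1 = rng.normal(scale=0.3, size=(d, h)), np.zeros(h)
W2, b2 = rng.normal(scale=0.3, size=(h, 2)), np.zeros(2)

def forward(inputs):
    Z = np.maximum(inputs @ W1 + b1, 0.0)         # ReLU hidden layer
    L = Z @ W2 + b2
    P = np.exp(L - L.max(1, keepdims=True))
    return Z, P / P.sum(1, keepdims=True)

for _ in range(300):                              # full-batch softmax cross-entropy training
    Z, P = forward(X)
    G = P.copy(); G[np.arange(n), y] -= 1; G /= n
    dW2, db2 = Z.T @ G, G.sum(0)
    dZ = (G @ W2.T) * (Z > 0)
    dW1, db1 = X.T @ dZ, dZ.sum(0)
    for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        p -= 0.5 * g

Xc = rng.normal(size=(500, d))                    # clean test inputs
yc = (Xc @ w_true > 0).astype(int)
Xt = Xc.copy(); Xt[:, -1] = 4.0                   # same inputs with the trigger stamped

def prune(W, frac):
    """Zero out the smallest-magnitude fraction of the entries of W."""
    thr = np.quantile(np.abs(W), frac)
    return np.where(np.abs(W) >= thr, W, 0.0)

W1_full, W2_full = W1.copy(), W2.copy()
for frac in (0.0, 0.5, 0.8, 0.9, 0.95):           # sparsity scan
    W1, W2 = prune(W1_full, frac), prune(W2_full, frac)
    clean_acc = (forward(Xc)[1].argmax(1) == yc).mean()
    asr = (forward(Xt)[1].argmax(1) == 1).mean()
    print(f"pruned {frac:.0%}: clean acc {clean_acc:.2f}, attack success {asr:.2f}")
```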
2021-06-24
Tsaknakis, Ioannis, Hong, Mingyi, Liu, Sijia.  2020.  Decentralized Min-Max Optimization: Formulations, Algorithms and Applications in Network Poisoning Attack. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :5755–5759.
This paper discusses formulations and algorithms that allow a number of agents to collectively solve problems involving both (non-convex) minimization and (concave) maximization operations. These problems have a number of interesting applications in information processing and machine learning; in particular, they can be used to model an adversarial learning problem known as network data poisoning. We develop a number of algorithms to efficiently solve these non-convex min-max optimization problems by combining techniques such as gradient tracking from the decentralized optimization literature and gradient descent-ascent schemes from the min-max optimization literature. We also establish convergence to a first-order stationary point under certain conditions. Finally, we perform experiments demonstrating that the proposed algorithms are effective for the network data poisoning attack.
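As a rough illustration of the decentralized min-max setting, the sketch below runs a simplified consensus-plus-gradient-descent-ascent scheme (not the paper's gradient-tracking algorithms) on a strongly-convex-strongly-concave toy problem whose saddle point can also be computed in closed form for comparison. The ring topology, local objectives, and step sizes are illustrative assumptions.

```python
# Minimal sketch: five agents on a ring approximately solve
# min_x max_y sum_i f_i(x, y), with f_i(x, y) = 0.5||x - a_i||^2 + x^T B_i y - 0.5||y - c_i||^2.
# This is a simplified consensus + gradient descent-ascent scheme, not the paper's method.
import numpy as np

rng = np.random.default_rng(2)
m, d = 5, 3                               # number of agents, variable dimension
A = rng.normal(size=(m, d))               # local data a_i
C = rng.normal(size=(m, d))               # local data c_i
B = rng.normal(scale=0.2, size=(m, d, d)) # local coupling matrices B_i

# Doubly stochastic mixing matrix for a ring graph.
W = np.zeros((m, m))
for i in range(m):
    W[i, i] = 0.5
    W[i, (i - 1) % m] = W[i, (i + 1) % m] = 0.25

x = np.zeros((m, d))                      # each agent's copy of the min variable
y = np.zeros((m, d))                      # each agent's copy of the max variable
alpha, beta = 0.05, 0.05

for _ in range(500):
    gx = x - A + np.einsum('idk,ik->id', B, y)      # grad_x f_i(x_i, y_i)
    gy = np.einsum('idk,id->ik', B, x) - (y - C)    # grad_y f_i(x_i, y_i)
    x = W @ x - alpha * gx                          # consensus + descent on x
    y = W @ y + beta * gy                           # consensus + ascent on y

# Closed-form centralized saddle point of sum_i f_i for comparison:
#   m*x + (sum_i B_i) y = sum_i a_i,   (sum_i B_i)^T x - m*y = -sum_i c_i.
Bs = B.sum(0)
K = np.block([[m * np.eye(d), Bs], [Bs.T, -m * np.eye(d)]])
sol = np.linalg.solve(K, np.concatenate([A.sum(0), -C.sum(0)]))
print("agent-averaged x:", x.mean(0).round(3), " centralized x*:", sol[:d].round(3))
print("agent-averaged y:", y.mean(0).round(3), " centralized y*:", sol[d:].round(3))
```

With constant step sizes this simple scheme only reaches a neighborhood of the centralized saddle point; gradient tracking, as used in the paper, removes that consensus bias.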
2020-09-04
Zhao, Pu, Liu, Sijia, Chen, Pin-Yu, Hoang, Nghia, Xu, Kaidi, Kailkhura, Bhavya, Lin, Xue.  2019.  On the Design of Black-Box Adversarial Examples by Leveraging Gradient-Free Optimization and Operator Splitting Method. 2019 IEEE/CVF International Conference on Computer Vision (ICCV). :121–130.
Robust machine learning is currently one of the most prominent topics and could help shape a future of advanced AI platforms that perform well not only in average cases but also in worst cases or adverse situations. Despite this long-term vision, however, existing studies on black-box adversarial attacks are still restricted to very specific threat-model settings (e.g., a single distortion metric and restrictive assumptions on the target model's feedback to queries) and/or suffer from prohibitively high query complexity. To push for further advances in this field, we introduce a general framework based on an operator splitting method, the alternating direction method of multipliers (ADMM), to devise efficient, robust black-box attacks that work with various distortion metrics and feedback settings without incurring high query complexity. Due to the black-box nature of the threat model, the proposed ADMM solution framework is integrated with zeroth-order (ZO) optimization and Bayesian optimization (BO), and is thus applicable to the gradient-free regime. This results in two new black-box adversarial attack generation methods, ZO-ADMM and BO-ADMM. Our empirical evaluations on image classification datasets show that our proposed approaches have much lower function query complexity than state-of-the-art attack methods, yet achieve very competitive attack success rates.
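The gradient-free ingredient can be illustrated in isolation. The sketch below, which is not the paper's ZO-ADMM or BO-ADMM solver, uses a standard two-point random-direction zeroth-order gradient estimator to craft an L∞-bounded perturbation against a "black-box" loss that can only be queried; the hypothetical linear victim model, margin loss, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch, assuming a hypothetical linear victim model that can only be
# queried through its margin loss; not the paper's ZO-ADMM/BO-ADMM solvers.
import numpy as np

rng = np.random.default_rng(3)
d, k = 20, 3
Wv = rng.normal(size=(d, k))              # victim weights, hidden behind query()
x0 = rng.normal(size=d)
true_label = int(np.argmax(x0 @ Wv))

def query(x):
    """Black-box access: margin of the true class (negative means misclassified)."""
    logits = x @ Wv
    return logits[true_label] - np.max(np.delete(logits, true_label))

def zo_gradient(f, x, mu=0.01, q=20):
    """Two-point random-direction zeroth-order gradient estimate (2q queries)."""
    g = np.zeros_like(x)
    for _ in range(q):
        u = rng.normal(size=x.shape)
        u /= np.linalg.norm(u)
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g * x.size / q

eps, lr, q = 0.5, 0.05, 20                # L_inf budget, step size, directions per step
delta = np.zeros(d)
for t in range(200):
    g = zo_gradient(query, x0 + delta, q=q)
    delta = np.clip(delta - lr * np.sign(g), -eps, eps)   # signed step, project to L_inf ball
    if query(x0 + delta) < 0:
        print(f"misclassified after {t + 1} steps and {2 * q * (t + 1)} model queries")
        break
else:
    print("attack did not succeed within the query budget")
```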
2019-02-08
Zhao, Pu, Liu, Sijia, Wang, Yanzhi, Lin, Xue.  2018.  An ADMM-Based Universal Framework for Adversarial Attacks on Deep Neural Networks. Proceedings of the 26th ACM International Conference on Multimedia. :1065-1073.

Deep neural networks (DNNs) are known to be vulnerable to adversarial attacks. That is, adversarial examples, obtained by adding delicately crafted distortions to original legal inputs, can mislead a DNN into classifying them as any target label. In a successful adversarial attack, the targeted misclassification should be achieved with the minimal added distortion. In the literature, the added distortions are usually measured by the L0, L1, L2, and L∞ norms, giving rise to L0, L1, L2, and L∞ attacks, respectively. However, the literature lacks a versatile framework covering all types of adversarial attacks. This work for the first time unifies the methods of generating adversarial examples by leveraging ADMM (Alternating Direction Method of Multipliers), an operator splitting optimization approach, such that L0, L1, L2, and L∞ attacks can all be implemented effectively within this general framework with only minor modifications. Compared with the state-of-the-art attacks in each category, our ADMM-based attacks are so far the strongest, achieving both a 100% attack success rate and the minimal distortion.
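To see why ADMM can unify the different norm constraints, consider the toy sketch below (not the paper's framework): the perturbation is split into two copies, one handled by a closed-form loss-minimization step and one by the proximal operator of the norm term, so switching from an L1 attack to another Lp attack only changes the proximal step. The linear victim score, the L1-plus-box regularizer, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch, assuming a linear victim score and an L1 + box-constrained
# perturbation; not the paper's framework. Swapping prox_l1_box for another
# norm's proximal operator yields the other Lp attack variants.
import numpy as np

rng = np.random.default_rng(4)
d = 30
w = rng.normal(size=d)                    # victim's linear score s(x) = w @ x
x0 = rng.normal(size=d)
lam, rho, eps = 0.5, 1.0, 0.3             # L1 weight, ADMM penalty, L_inf box

# Attack: minimize s(x0 + delta) + lam*||delta||_1  subject to  |delta_i| <= eps.
# ADMM splitting: delta handles the (here linear) loss, z handles the norm + box.
delta, z, u = np.zeros(d), np.zeros(d), np.zeros(d)   # u is the scaled dual variable

def prox_l1_box(v, t, eps):
    """Prox of t*||.||_1 restricted to [-eps, eps]^d: soft-threshold, then clip."""
    return np.clip(np.sign(v) * np.maximum(np.abs(v) - t, 0.0), -eps, eps)

for _ in range(200):
    # delta-step: argmin_delta  w @ delta + (rho/2)*||delta - z + u||^2  (closed form)
    delta = z - u - w / rho
    # z-step: argmin_z  lam*||z||_1 + box constraint + (rho/2)*||delta - z + u||^2
    z = prox_l1_box(delta + u, lam / rho, eps)
    # dual ascent on the consensus constraint delta = z
    u += delta - z

print("victim score before attack:", round(float(w @ x0), 3))
print("victim score after attack :", round(float(w @ (x0 + z)), 3))
print("nonzero perturbation coordinates:", int(np.count_nonzero(z)))
```

For a real DNN, the delta-step has no closed form and is solved approximately with model gradients (or, in the black-box case, with the gradient-free estimators of the ICCV 2019 entry above); the z-step still reduces to the chosen norm's proximal operator.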