Latent Backdoor Attacks on Deep Neural Networks

Title: Latent Backdoor Attacks on Deep Neural Networks
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Yao, Yuanshun; Li, Huiying; Zheng, Haitao; Zhao, Ben Y.
Conference Name: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security
Date Published: November 2019
Publisher: Association for Computing Machinery
Conference Location: London, United Kingdom
ISBN Number: 978-1-4503-6747-9
Keywords: artificial intelligence, Human Behavior, human factors, machine learning, Metrics, Neural networks, pubcrawl, stylometry
Abstract:

Recent work proposed the concept of backdoor attacks on deep neural networks (DNNs), where misclassification rules are hidden inside normal models, only to be triggered by very specific inputs. However, these "traditional" backdoors assume a context where users train their own models from scratch, which rarely occurs in practice. Instead, users typically customize "Teacher" models already pretrained by providers like Google, through a process called transfer learning. This customization process introduces significant changes to models and disrupts hidden backdoors, greatly reducing the actual impact of backdoors in practice. In this paper, we describe latent backdoors, a more powerful and stealthy variant of backdoor attacks that functions under transfer learning. Latent backdoors are incomplete backdoors embedded into a "Teacher" model, and automatically inherited by multiple "Student" models through transfer learning. If any Student model includes the label targeted by the backdoor, its customization process completes the backdoor and makes it active. We show that latent backdoors can be quite effective in a variety of application contexts, and validate their practicality through real-world attacks against traffic sign recognition, iris identification of volunteers, and facial recognition of public figures (politicians). Finally, we evaluate four potential defenses, and find that only one is effective in disrupting latent backdoors, but it might incur a cost in classification accuracy as a tradeoff.
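To make the transfer-learning step the abstract describes concrete, the sketch below (not from the paper; the ResNet-18 Teacher, label count, and hyperparameters are illustrative assumptions) shows why a Student inherits whatever the Teacher's layers encode: the Student copies the Teacher, freezes the inherited layers, and retrains only a new classification layer, so anything embedded in the frozen layers, including a latent backdoor, survives customization intact.

```python
# Illustrative sketch of the transfer-learning customization step that
# latent backdoors exploit. Not the authors' code; model and label count
# are hypothetical stand-ins.
import copy

import torch
import torch.nn as nn
from torchvision import models

# Hypothetical Teacher: a pretrained ResNet-18 stands in for a provider model.
teacher = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# The Student starts as a full copy of the Teacher's weights.
student = copy.deepcopy(teacher)
for param in student.parameters():
    param.requires_grad = False  # freeze all inherited (possibly backdoored) layers

# Replace only the final classification layer with one sized for the
# Student's own label set (hypothetical count).
num_student_classes = 10
student.fc = nn.Linear(student.fc.in_features, num_student_classes)

# Only the new classifier is optimized; every earlier layer, and anything
# the Teacher embedded in it, passes to the Student unchanged.
optimizer = torch.optim.SGD(student.fc.parameters(), lr=1e-3, momentum=0.9)
```

If the Student's label set includes the attacker's target label, training this final layer is exactly the step that, per the abstract, completes the latent backdoor.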

URL: https://dl.acm.org/doi/10.1145/3319535.3354209
DOI: 10.1145/3319535.3354209
Citation Key: yao_latent_2019