Latent Backdoor Attacks on Deep Neural Networks

Title: Latent Backdoor Attacks on Deep Neural Networks
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Yao, Yuanshun; Li, Huiying; Zheng, Haitao; Zhao, Ben Y.
Conference Name: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security
Date Published: November 2019
Publisher: Association for Computing Machinery
Conference Location: London, United Kingdom
ISBN Number: 978-1-4503-6747-9
Keywords: artificial intelligence, Human Behavior, human factors, machine learning, Metrics, Neural networks, pubcrawl, stylometry
Abstract:

Recent work proposed the concept of backdoor attacks on deep neural networks (DNNs), where misclassification rules are hidden inside normal models, only to be triggered by very specific inputs. However, these "traditional" backdoors assume a context where users train their own models from scratch, which rarely occurs in practice. Instead, users typically customize "Teacher" models already pretrained by providers like Google, through a process called transfer learning. This customization process introduces significant changes to models and disrupts hidden backdoors, greatly reducing the actual impact of backdoors in practice. In this paper, we describe latent backdoors, a more powerful and stealthy variant of backdoor attacks that functions under transfer learning. Latent backdoors are incomplete backdoors embedded into a "Teacher" model, and automatically inherited by multiple "Student" models through transfer learning. If any Student model includes the label targeted by the backdoor, its customization process completes the backdoor and makes it active. We show that latent backdoors can be quite effective in a variety of application contexts, and validate their practicality through real-world attacks against traffic sign recognition, iris identification of volunteers, and facial recognition of public figures (politicians). Finally, we evaluate four potential defenses, and find that only one is effective in disrupting latent backdoors, but it might incur a cost in classification accuracy as a tradeoff.
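To make the transfer-learning step the abstract describes concrete, the sketch below (not from the paper; the ResNet-18 Teacher, label count, and hyperparameters are illustrative assumptions) shows why a Student inherits whatever the Teacher's layers encode: the Student copies the Teacher, freezes the inherited layers, and retrains only a new classification layer, so anything embedded in the frozen layers, including a latent backdoor, survives customization intact.

```python
# Illustrative sketch of the transfer-learning customization step that
# latent backdoors exploit. Not the authors' code; model and label count
# are hypothetical stand-ins.
import copy

import torch
import torch.nn as nn
from torchvision import models

# Hypothetical Teacher: a pretrained ResNet-18 stands in for a provider model.
teacher = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# The Student starts as a full copy of the Teacher's weights.
student = copy.deepcopy(teacher)
for param in student.parameters():
    param.requires_grad = False  # freeze all inherited (possibly backdoored) layers

# Replace only the final classification layer with one sized for the
# Student's own label set (hypothetical count).
num_student_classes = 10
student.fc = nn.Linear(student.fc.in_features, num_student_classes)

# Only the new classifier is optimized; every earlier layer, and anything
# the Teacher embedded in it, passes to the Student unchanged.
optimizer = torch.optim.SGD(student.fc.parameters(), lr=1e-3, momentum=0.9)
```

If the Student's label set includes the attacker's target label, training this final layer is exactly the step that, per the abstract, completes the latent backdoor.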

URL: https://dl.acm.org/doi/10.1145/3319535.3354209
DOI: 10.1145/3319535.3354209
Citation Key: yao_latent_2019