Fooling End-To-End Speaker Verification With Adversarial Examples

Submitted by aekwall on Wed, 01/16/2019 - 2:10pm

Title	Fooling End-To-End Speaker Verification With Adversarial Examples
Publication Type	Conference Paper
Year of Publication	2018
Authors	Kreuk, F., Adi, Y., Cisse, M., Keshet, J.
Conference Name	2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords	adversarial examples, Automatic speaker verification, automatic speaker verification systems, black-box attacks, composability, deep end-to-end network, end-to-end deep neural models, fooling end-to-end speaker verification, Mel frequency cepstral coefficient, Metrics, MFCC, neural nets, Neural networks, NTIMIT, original speaker examples, Perturbation methods, pubcrawl, resilience, security of data, speaker recognition, Standards, Task Analysis, Training, White Box Security, YOHO
Abstract	Automatic speaker verification systems are increasingly used as the primary means to authenticate customers. Recently, it has been proposed to train speaker verification systems using end-to-end deep neural models. In this paper, we show that such systems are vulnerable to adversarial example attacks. Adversarial examples are generated by adding a peculiar noise to original speaker examples in such a way that they are almost indistinguishable from the originals to a human listener. Yet the generated waveforms, which sound like speaker A, can be used to fool such a system by claiming that they were uttered by speaker B. We present white-box attacks on a deep end-to-end network that was trained on either YOHO or NTIMIT. We also present two black-box attacks. In the first, we generate adversarial examples with a system trained on NTIMIT and perform the attack on a system that was trained on YOHO. In the second, we generate the adversarial examples with a system trained using Mel-spectrum features and perform the attack on a system trained using MFCCs. Our results show that one can significantly decrease the accuracy of a target system even when the adversarial examples are generated with a different system, potentially using different features.
DOI	10.1109/ICASSP.2018.8462693
Citation Key	kreuk_fooling_2018
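
Illustrative sketch (not part of the original record): the abstract describes adding a small, carefully chosen noise to a speaker's example so that a verifier accepts it as another speaker. This record does not state how the noise is computed, so the snippet below assumes a fast-gradient-sign step, one common way to build adversarial examples, and applies it to a toy logistic-regression "verifier" rather than the paper's end-to-end network. The names verification_score and fgsm_perturb, the 40-dimensional feature vector, and the epsilon value are all hypothetical.

```python
import numpy as np

def verification_score(w, b, x):
    """Toy verifier (assumption): probability that features x match the claimed speaker."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

def fgsm_perturb(w, b, x, target, epsilon):
    """One fast-gradient-sign step that pushes the score toward `target` (0.0 or 1.0)."""
    p = verification_score(w, b, x)
    # For a logistic model, the gradient of the binary cross-entropy loss
    # with respect to the input x is (p - target) * w.
    grad = (p - target) * w
    # Step against the gradient sign to lower the loss for the target label.
    # Each feature moves by at most epsilon.
    return x - epsilon * np.sign(grad)

# Hypothetical setup: random weights stand in for a trained model.
rng = np.random.default_rng(0)
w = rng.normal(size=40)
b = 0.0

# Features that the toy verifier rejects for the claimed speaker.
x = -0.05 * np.sign(w)

# Perturb the features so the verifier accepts the claim.
x_adv = fgsm_perturb(w, b, x, target=1.0, epsilon=0.1)

score_before = verification_score(w, b, x)
score_after = verification_score(w, b, x_adv)
max_change = np.max(np.abs(x_adv - x))
print(score_before, score_after, max_change)
```

In this toy setting the perturbation is bounded by epsilon per feature, yet it flips the verifier's decision. A real attack on audio would also have to map the perturbed features back to a waveform, which this sketch does not attempt.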
