Attacks on Digital Watermarks for Deep Neural Networks

Title: Attacks on Digital Watermarks for Deep Neural Networks
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Wang, Tianhao; Kerschbaum, Florian
Conference Name: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date Published: May 2019
Keywords: Attack, composability, Computational modeling, copy protection, deep learning models, Deep Neural Network, deep neural networks training, detection algorithms, Digital Watermarks, fast response times, industrial property, intellectual property theft, ip protection, learning (artificial intelligence), Mathematical model, Mobile app, model prediction, neural nets, Neural networks, policy-based governance, pubcrawl, resilience, Resiliency, Standards, statistical distribution, Training, watermark, Watermarking
Abstract: Training deep neural networks is a computationally expensive task. Furthermore, models are often derived from proprietary datasets that have been carefully prepared and labelled. Hence, creators of deep learning models want to protect their models against intellectual property theft. However, this is not always possible, since the model may, e.g., be embedded in a mobile app for fast response times. As a countermeasure, watermarks for deep neural networks have been developed that embed secret information into the model. This information can later be retrieved by the creator to prove ownership. Uchida et al. proposed the first such watermarking method. The advantage of their scheme is that it does not compromise the accuracy of the model's predictions. However, in this paper we show that their technique modifies the statistical distribution of the model's weights. Using this modification we can not only detect the presence of a watermark, but also derive its embedding length and use this information to remove the watermark by overwriting it. We show analytically that our detection algorithm follows directly from their embedding algorithm, and we propose a possible countermeasure. Our findings should help to refine the definition of undetectability of watermarks for deep neural networks.
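The following is a minimal NumPy sketch, not the authors' code, of the setting the abstract describes: Uchida et al. embed a bit string into a layer's weights by minimizing a binary cross-entropy regularizer on sigmoid(X @ w) for a secret random matrix X, and the attack exploits the statistical shift this embedding leaves in the weights. All names, dimensions, and hyperparameters below are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def embed_watermark(w, bits, X, lr=0.1, steps=1000):
    # Gradient descent on an Uchida-style embedding regularizer: the
    # binary cross-entropy between sigmoid(X @ w) and the bit string.
    # The model's task loss is omitted here for brevity.
    w = w.copy()
    for _ in range(steps):
        p = sigmoid(X @ w)                      # current bit estimates
        w -= lr * X.T @ (p - bits) / len(bits)  # BCE gradient w.r.t. w
    return w

n, T = 256, 32                                   # weight dim, embedding length
w_clean = rng.normal(0.0, 0.01, size=n)          # stand-in for trained weights
bits = rng.integers(0, 2, size=T).astype(float)  # secret watermark bits
X = rng.normal(size=(T, n))                      # secret embedding matrix
w_marked = embed_watermark(w_clean, bits, X)

# The watermark extracts correctly ...
print("bits recovered:", np.mean((sigmoid(X @ w_marked) > 0.5) == (bits > 0.5)))
# ... but embedding has shifted the weight distribution, which is the
# statistical footprint the paper's detection attack builds on.
print("std of clean weights      :", w_clean.std())
print("std of watermarked weights:", w_marked.std())

Under these assumptions the watermarked weights show a markedly larger spread than the clean ones; per the abstract, the paper goes further, deriving the embedding length from this footprint and removing the mark by overwriting it.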
DOI: 10.1109/ICASSP.2019.8682202
Citation Key: wang_attacks_2019