Biblio
Adversarial Video Captioning. 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W), pp. 24-27.
2019. In recent years, advances in computer vision have allowed deep learning-based techniques to surpass human-level performance. However, these advances have also given rise to adversarial machine learning techniques, including targeted image captioning attacks that easily fool deep learning models. Although attacks in the image domain are well studied, little work has been done in the video domain. In this paper, we show that prior attacks in the image domain can be extended to the video captioning task without heavily affecting the video's playback quality. We demonstrate our attack against a state-of-the-art video captioning model by extending a prior image captioning attack known as Show and Fool. To the best of our knowledge, this is the first successful method for targeted attacks against a video captioning model. It injects 'subliminal' perturbations into the video stream and forces the model to output a chosen caption, achieving near-perfect agreement with the chosen target captions (up to 0.981 cosine similarity).
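To give a feel for the cosine similarity figure quoted in the abstract, here is a minimal Python sketch that scores how close a generated caption is to a target caption. The abstract does not say how captions are turned into vectors, so the bag-of-words representation is an assumption made for illustration only, as are the helper names `bag_of_words` and `caption_similarity`. The authors' actual metric may use a different embedding.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for empty captions
    return dot / (norm_u * norm_v)


def bag_of_words(caption, vocab):
    """Hypothetical helper: word-count vector of a caption over a fixed vocabulary."""
    counts = Counter(caption.lower().split())
    return [counts[word] for word in vocab]


def caption_similarity(generated, target):
    """Assumed illustration: cosine similarity of bag-of-words caption vectors."""
    vocab = sorted(set(generated.lower().split()) | set(target.lower().split()))
    return cosine_similarity(bag_of_words(generated, vocab),
                             bag_of_words(target, vocab))


if __name__ == "__main__":
    target = "a man is cooking"
    print(caption_similarity("a man is cooking", target))  # identical captions
    print(caption_similarity("a man is running", target))  # one word differs
```

Under this representation, a score near 1 means the model's output shares almost all of its words with the attacker's chosen caption, and a score of 0 means the two captions share none.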