Efficiently Stealing your Machine Learning Models

Title: Efficiently Stealing your Machine Learning Models
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Reith, Robert Nikolai; Schneider, Thomas; Tkachenko, Oleksandr
Conference Name: Proceedings of the 18th ACM Workshop on Privacy in the Electronic Society
Publisher: Association for Computing Machinery
Conference Location: London, United Kingdom
ISBN Number: 978-1-4503-6830-8
Keywords: composability, ideal leakage, machine learning, model extraction, Predictive Metrics, pubcrawl, Resiliency, support vector machine, Support vector machines, support vector regression machine
Abstract: Machine Learning as a Service (MLaaS) is a growing paradigm in the Machine Learning (ML) landscape. More and more ML models are being uploaded to the cloud and made accessible from all over the world. Creating good ML models, however, can be expensive and the used data is often sensitive. Recently, Secure Multi-Party Computation (SMPC) protocols for MLaaS have been proposed, which protect sensitive user data and ML models at the expense of substantially higher computation and communication than plaintext evaluation. In this paper, we show that for a subset of ML models used in MLaaS, namely Support Vector Machines (SVMs) and Support Vector Regression Machines (SVRs), which have found many applications to classifying multimedia data such as texts and images, it is possible for adversaries to passively extract the private models even if they are protected by SMPC, using known and newly devised model extraction attacks. We show that our attacks are not only theoretically possible but also practically feasible and cheap, which makes them lucrative to financially motivated attackers such as competitors or customers. We perform model extraction attacks on the homomorphic encryption-based protocol for privacy-preserving SVR-based indoor localization by Zhang et al. (International Workshop on Security 2016). We show that it is possible to extract a highly accurate model using only 854 queries with the estimated cost of $0.09 on the Amazon ML platform, and our attack would take only 7 minutes over the Internet. Also, we perform our model extraction attacks on SVM and SVR models trained on publicly available state-of-the-art ML datasets.
URL: https://doi.org/10.1145/3338498.3358646
DOI: 10.1145/3338498.3358646
Citation Key: reith_efficiently_2019
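
The abstract describes extracting SVR models through prediction queries alone. The sketch below illustrates the simplest form of that idea, an equation-solving extraction of a linear regression model f(x) = w.x + b using dim + 1 queries. It is an illustration only and is not taken from the paper: the oracle, the function names (make_oracle, extract_linear_model), and the secret parameters are hypothetical, and kernel SVRs or protocols that return only protected outputs would likely need different techniques and more queries.

```python
import numpy as np

def make_oracle(w, b):
    """Hypothetical black-box prediction API for a linear model f(x) = w.x + b.

    Returns the prediction function and a counter of how often it was called.
    """
    calls = {"n": 0}

    def predict(x):
        calls["n"] += 1
        return float(np.dot(w, x) + b)

    return predict, calls

def extract_linear_model(predict, dim):
    """Recover (w, b) from dim + 1 queries: the origin, then each unit vector."""
    # f(0) = b, so the first query reveals the bias directly.
    b_hat = predict(np.zeros(dim))
    # f(e_i) = w_i + b, so subtracting the bias reveals each weight.
    unit_vectors = np.eye(dim)
    w_hat = np.array([predict(unit_vectors[i]) - b_hat for i in range(dim)])
    return w_hat, b_hat

# Assumed secret model held by the service (illustrative values only).
rng = np.random.default_rng(0)
secret_w = rng.normal(size=5)
secret_b = 0.7

predict, calls = make_oracle(secret_w, secret_b)
w_hat, b_hat = extract_linear_model(predict, dim=5)
print("queries used:", calls["n"])
print("max weight error:", float(np.max(np.abs(w_hat - secret_w))))
```

The query count grows linearly with the input dimension in this toy setting, which gives some intuition for why a few hundred queries can suffice against a moderately sized regression model.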