Biblio
Filters: Keyword is model extraction
DeepSteal: Advanced Model Extractions Leveraging Efficient Weight Stealing in Memories. 2022 IEEE Symposium on Security and Privacy (SP). :1157–1174.
2022. Recent advancements in Deep Neural Networks (DNNs) have enabled widespread deployment in multiple security-sensitive domains. The need for resource-intensive training and the use of valuable domain-specific training data have made these models the top intellectual property (IP) for model owners. One of the major threats to DNN privacy is model extraction attacks, where adversaries attempt to steal sensitive information in DNN models. In this work, we propose an advanced model extraction framework, DeepSteal, that steals DNN weights remotely for the first time with the aid of a memory side-channel attack. Our proposed DeepSteal comprises two key stages. First, we develop a new weight bit information extraction method, called HammerLeak, by adopting the rowhammer-based fault technique as the information leakage vector. HammerLeak leverages several novel system-level techniques tailored for DNN applications to enable fast and efficient weight stealing. Second, we propose a novel substitute model training algorithm with a Mean Clustering weight penalty, which leverages the partially leaked bit information effectively and generates a substitute prototype of the target victim model. We evaluate the proposed model extraction framework on three popular image datasets (CIFAR-10/100/GTSRB) and four DNN architectures (ResNet-18/34, Wide-ResNet, VGG-11). The extracted substitute model has successfully achieved more than 90% test accuracy on deep residual networks for the CIFAR-10 dataset. Moreover, our extracted substitute model can also generate effective adversarial input samples to fool the victim model. Notably, it achieves performance similar to that of white-box adversarial input attacks such as PGD/TRADES (i.e., 1-2% test accuracy under attack).
ISSN: 2375-1207
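The substitute-training stage described in this abstract can be illustrated with a short sketch. This is a minimal, hypothetical rendering, assuming (the abstract does not specify this) that the leaked bits of each victim weight constrain it to a value interval and that the "Mean Clustering" penalty pulls the corresponding substitute weight toward that interval's midpoint; the names `victim`, `leaked_mid`, and `leaked_mask`, and the generic distillation loop, are illustrative assumptions rather than the paper's exact algorithm.

```python
# Hedged sketch: substitute-model training with a "mean-clustering"-style weight
# penalty. Assumption (not from the abstract): leaked bits constrain each weight
# to an interval whose midpoint is stored in leaked_mid; leaked_mask marks which
# weights have any recovered bits.
import torch
import torch.nn.functional as F

def mean_clustering_penalty(model, leaked_mid, leaked_mask):
    """Squared distance between substitute weights and the midpoints implied by
    partially leaked bits, restricted to weights whose bits were recovered."""
    penalty = 0.0
    for name, p in model.named_parameters():
        if name in leaked_mid:
            diff = (p - leaked_mid[name]) * leaked_mask[name]
            penalty = penalty + diff.pow(2).sum()
    return penalty

def train_substitute(substitute, victim, loader, leaked_mid, leaked_mask,
                     lam=1e-3, epochs=10, lr=1e-3, device="cpu"):
    """Generic distillation loop: match the victim's soft predictions while
    keeping substitute weights close to the leaked-bit midpoints."""
    opt = torch.optim.Adam(substitute.parameters(), lr=lr)
    substitute.to(device).train()
    victim.to(device).eval()
    for _ in range(epochs):
        for x, _ in loader:
            x = x.to(device)
            with torch.no_grad():
                soft_labels = F.softmax(victim(x), dim=1)  # victim outputs as targets
            logits = substitute(x)
            task_loss = F.kl_div(F.log_softmax(logits, dim=1), soft_labels,
                                 reduction="batchmean")
            loss = task_loss + lam * mean_clustering_penalty(
                substitute, leaked_mid, leaked_mask)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return substitute
```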
PRADA: Protecting Against DNN Model Stealing Attacks. 2019 IEEE European Symposium on Security and Privacy (EuroS&P). :512–527.
2019. Machine learning (ML) applications are increasingly prevalent. Protecting the confidentiality of ML models becomes paramount for two reasons: (a) a model can be a business advantage to its owner, and (b) an adversary may use a stolen model to find transferable adversarial examples that can evade classification by the original model. Access to the model can be restricted to be only via well-defined prediction APIs. Nevertheless, prediction APIs still provide enough information to allow an adversary to mount model extraction attacks by sending repeated queries via the prediction API. In this paper, we describe new model extraction attacks using novel approaches for generating synthetic queries, and optimizing training hyperparameters. Our attacks outperform state-of-the-art model extraction in terms of transferability of both targeted and non-targeted adversarial examples (up to +29-44 percentage points, pp), and prediction accuracy (up to +46 pp) on two datasets. We provide take-aways on how to perform effective model extraction attacks. We then propose PRADA, the first step towards generic and effective detection of DNN model extraction attacks. It analyzes the distribution of consecutive API queries and raises an alarm when this distribution deviates from benign behavior. We show that PRADA can detect all prior model extraction attacks with no false positives.
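The detection idea stated in this abstract (raise an alarm when the distribution of consecutive API queries deviates from benign behavior) can be sketched as follows. This is one plausible reading only: the use of minimum L2 distances to a client's past queries and a Shapiro-Wilk normality test, as well as the class and parameter names, are assumptions, not necessarily the paper's exact procedure.

```python
# Hedged sketch of a PRADA-style detector (assumed mechanics, see note above).
import numpy as np
from scipy import stats

class QueryDistributionDetector:
    def __init__(self, threshold=0.05, min_samples=20):
        self.threshold = threshold      # p-value below which an alarm is raised
        self.min_samples = min_samples  # collect enough queries before testing
        self.history = []               # past queries from this client
        self.distances = []             # min distance of each query to prior ones

    def observe(self, query):
        """Record one query (flattened feature vector); return True if the
        accumulated distance distribution no longer looks benign."""
        q = np.asarray(query, dtype=float).ravel()
        if self.history:
            dists = np.linalg.norm(np.stack(self.history) - q, axis=1)
            self.distances.append(dists.min())
        self.history.append(q)
        if len(self.distances) < self.min_samples:
            return False
        # Assumption: benign clients yield roughly normally distributed minimum
        # distances; a low Shapiro-Wilk p-value flags a suspicious query stream.
        _, p_value = stats.shapiro(self.distances)
        return p_value < self.threshold
```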
Efficiently Stealing your Machine Learning Models. Proceedings of the 18th ACM Workshop on Privacy in the Electronic Society. :198–210.
2019. Machine Learning as a Service (MLaaS) is a growing paradigm in the Machine Learning (ML) landscape. More and more ML models are being uploaded to the cloud and made accessible from all over the world. Creating good ML models, however, can be expensive, and the data used is often sensitive. Recently, Secure Multi-Party Computation (SMPC) protocols for MLaaS have been proposed, which protect sensitive user data and ML models at the expense of substantially higher computation and communication than plaintext evaluation. In this paper, we show that for a subset of ML models used in MLaaS, namely Support Vector Machines (SVMs) and Support Vector Regression Machines (SVRs), which have found many applications in classifying multimedia data such as texts and images, it is possible for adversaries to passively extract the private models even if they are protected by SMPC, using known and newly devised model extraction attacks. We show that our attacks are not only theoretically possible but also practically feasible and cheap, which makes them lucrative to financially motivated attackers such as competitors or customers. We perform model extraction attacks on the homomorphic encryption-based protocol for privacy-preserving SVR-based indoor localization by Zhang et al. (International Workshop on Security 2016). We show that it is possible to extract a highly accurate model using only 854 queries with an estimated cost of $0.09 on the Amazon ML platform, and that our attack would take only 7 minutes over the Internet. We also perform our model extraction attacks on SVM and SVR models trained on publicly available state-of-the-art ML datasets.
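As an illustration of the query-based extraction setting in this abstract, the sketch below shows the textbook equation-solving attack against a *linear* model exposed through a prediction API: d + 1 queries recover the weights and bias exactly. This is a generic baseline under that assumption, not the paper's specific attack, which also targets kernel SVMs/SVRs and an SMPC-protected localization protocol; the function names are illustrative.

```python
# Hedged sketch: equation-solving extraction of a linear model f(x) = w.x + b
# behind a prediction API, using dim + 1 queries.
import numpy as np

def extract_linear_model(predict, dim):
    """Recover (w, b) of f(x) = w.x + b by querying the origin and unit vectors."""
    b = predict(np.zeros(dim))          # f(0) = b
    w = np.empty(dim)
    for i in range(dim):
        e_i = np.zeros(dim)
        e_i[i] = 1.0
        w[i] = predict(e_i) - b         # f(e_i) - b = w_i
    return w, b

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_w, true_b = rng.normal(size=8), 0.5
    victim = lambda x: float(true_w @ x + true_b)   # stand-in for the remote API
    w, b = extract_linear_model(victim, dim=8)
    print(np.allclose(w, true_w), np.isclose(b, true_b))  # True True
```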