Visible to the public Biblio

Filters: Keyword is MNIST dataset  [Clear All Filters]
2020-08-07
Torkzadehmahani, Reihaneh, Kairouz, Peter, Paten, Benedict.  2019.  DP-CGAN: Differentially Private Synthetic Data and Label Generation. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). :98—104.
Generative Adversarial Networks (GANs) are one of the well-known models to generate synthetic data including images, especially for research communities that cannot use original sensitive datasets because they are not publicly accessible. One of the main challenges in this area is to preserve the privacy of individuals who participate in the training of the GAN models. To address this challenge, we introduce a Differentially Private Conditional GAN (DP-CGAN) training framework based on a new clipping and perturbation strategy, which improves the performance of the model while preserving privacy of the training dataset. DP-CGAN generates both synthetic data and corresponding labels and leverages the recently introduced Renyi differential privacy accountant to track the spent privacy budget. The experimental results show that DP-CGAN can generate visually and empirically promising results on the MNIST dataset with a single-digit epsilon parameter in differential privacy.
2020-07-20
Pengcheng, Li, Yi, Jinfeng, Zhang, Lijun.  2018.  Query-Efficient Black-Box Attack by Active Learning. 2018 IEEE International Conference on Data Mining (ICDM). :1200–1205.
Deep neural network (DNN) as a popular machine learning model is found to be vulnerable to adversarial attack. This attack constructs adversarial examples by adding small perturbations to the raw input, while appearing unmodified to human eyes but will be misclassified by a well-trained classifier. In this paper, we focus on the black-box attack setting where attackers have almost no access to the underlying models. To conduct black-box attack, a popular approach aims to train a substitute model based on the information queried from the target DNN. The substitute model can then be attacked using existing white-box attack approaches, and the generated adversarial examples will be used to attack the target DNN. Despite its encouraging results, this approach suffers from poor query efficiency, i.e., attackers usually needs to query a huge amount of times to collect enough information for training an accurate substitute model. To this end, we first utilize state-of-the-art white-box attack methods to generate samples for querying, and then introduce an active learning strategy to significantly reduce the number of queries needed. Besides, we also propose a diversity criterion to avoid the sampling bias. Our extensive experimental results on MNIST and CIFAR-10 show that the proposed method can reduce more than 90% of queries while preserve attacking success rates and obtain an accurate substitute model which is more than 85% similar with the target oracle.
2020-06-12
Min, Congwen, Li, Yi, Fang, Li, Chen, Ping.  2019.  Conditional Generative Adversarial Network on Semi-supervised Learning Task. 2019 IEEE 5th International Conference on Computer and Communications (ICCC). :1448—1452.

Semi-supervised learning has recently gained increasingly attention because it can combine abundant unlabeled data with carefully labeled data to train deep neural networks. However, common semi-supervised methods deeply rely on the quality of pseudo labels. In this paper, we proposed a new semi-supervised learning method based on Generative Adversarial Network (GAN), by using discriminator to learn the feature of both labeled and unlabeled data, instead of generating pseudo labels that cannot all be correct. Our approach, semi-supervised conditional GAN (SCGAN), builds upon the conditional GAN model, extending it to semi-supervised learning by changing the discriminator's output to a classification output and a real or false output. We evaluate our approach with basic semi-supervised model on MNIST dataset. It shows that our approach achieves the classification accuracy with 84.15%, outperforming the basic semi-supervised model with 72.94%, when labeled data are 1/600 of all data.

2020-04-20
Xiao, Tianrui, Khisti, Ashish.  2019.  Maximal Information Leakage based Privacy Preserving Data Disclosure Mechanisms. 2019 16th Canadian Workshop on Information Theory (CWIT). :1–6.
It is often necessary to disclose training data to the public domain, while protecting privacy of certain sensitive labels. We use information theoretic measures to develop such privacy preserving data disclosure mechanisms. Our mechanism involves perturbing the data vectors to strike a balance in the privacy-utility trade-off. We use maximal information leakage between the output data vector and the confidential label as our privacy metric. We first study the theoretical Bernoulli-Gaussian model and study the privacy-utility trade-off when only the mean of the Gaussian distributions can be perturbed. We show that the optimal solution is the same as the case when the utility is measured using probability of error at the adversary. We then consider an application of this framework to a data driven setting and provide an empirical approximation to the Sibson mutual information. By performing experiments on the MNIST and FERG data sets, we show that our proposed framework achieves equivalent or better privacy than previous methods based on mutual information.
2018-06-07
Marques, J., Andrade, J., Falcao, G..  2017.  Unreliable memory operation on a convolutional neural network processor. 2017 IEEE International Workshop on Signal Processing Systems (SiPS). :1–6.

The evolution of convolutional neural networks (CNNs) into more complex forms of organization, with additional layers, larger convolutions and increasing connections, established the state-of-the-art in terms of accuracy errors for detection and classification challenges in images. Moreover, as they evolved to a point where Gigabytes of memory are required for their operation, we have reached a stage where it becomes fundamental to understand how their inference capabilities can be impaired if data elements somehow become corrupted in memory. This paper introduces fault-injection in these systems by simulating failing bit-cells in hardware memories brought on by relaxing the 100% reliable operation assumption. We analyze the behavior of these networks calculating inference under severe fault-injection rates and apply fault mitigation strategies to improve on the CNNs resilience. For the MNIST dataset, we show that 8x less memory is required for the feature maps memory space, and that in sub-100% reliable operation, fault-injection rates up to 10-1 (with most significant bit protection) can withstand only a 1% error probability degradation. Furthermore, considering the offload of the feature maps memory to an embedded dynamic RAM (eDRAM) system, using technology nodes from 65 down to 28 nm, up to 73 80% improved power efficiency can be obtained.