Biblio
The vast majority of theoretical results in machine learning and statistics assume that the training data is a reliable reflection of the phenomena to be learned. Similarly, most learning techniques used in practice are brittle to the presence of large amounts of biased or malicious data. Motivated by this, we consider two frameworks for studying estimation, learning, and optimization in the presence of significant fractions of arbitrary data. The first framework, list-decodable learning, asks whether it is possible to return a list of answers such that at least one is accurate. For example, given a dataset of $n$ points for which an unknown subset of $\alpha n$ points are drawn from a distribution of interest, and no assumptions are made about the remaining $(1-\alpha)n$ points, is it possible to return a list of $\mathrm{poly}(1/\alpha)$ answers? The second framework, which we term the semi-verified model, asks whether a small dataset of trusted data (drawn from the distribution in question) can be used to extract accurate information from a much larger but untrusted dataset (of which only an $\alpha$-fraction is drawn from the distribution). We show strong positive results in both settings, and provide an algorithm for robust learning in a very general stochastic optimization setting. This result has immediate implications for robustly estimating the mean of distributions with bounded second moments, robustly learning mixtures of such distributions, and robustly finding planted partitions in random graphs in which significant portions of the graph have been perturbed by an adversary.
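To make the list-decoding guarantee concrete, a schematic statement for mean estimation under bounded second moments is given below; the $\tilde{O}(\sigma/\sqrt{\alpha})$ error rate is the scaling usually associated with this setting, and the exact constants and list size are placeholders rather than the paper's theorem.

```latex
% Schematic list-decodable mean estimation guarantee (constants illustrative).
Given $x_1, \dots, x_n$ such that an unknown subset $S$, $|S| \ge \alpha n$,
is drawn i.i.d.\ from a distribution with mean $\mu$ and covariance
$\Sigma \preceq \sigma^2 I$ (the remaining points being arbitrary), output a
list $\hat{\mu}_1, \dots, \hat{\mu}_m$ with $m \le \mathrm{poly}(1/\alpha)$
such that
\[
  \min_{1 \le j \le m} \|\hat{\mu}_j - \mu\|_2
  \le \tilde{O}\!\left(\frac{\sigma}{\sqrt{\alpha}}\right).
\]
```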
Network traffic classification is an important problem in network traffic analysis. It plays a vital role in many network tasks, including quality of service, firewall enforcement, and security. One of the challenges in classifying network traffic is the imbalanced nature of network data: the amount of traffic in some classes is usually much higher than in others. In this paper, we propose a deep learning approach to address the imbalanced data problem in network traffic classification. We use a recently proposed deep network for unsupervised learning, the Auxiliary Classifier Generative Adversarial Network (AC-GAN), to generate synthesized data samples that balance the minority and majority classes. We tested our method on a well-known network traffic dataset, and the results show that it achieves better performance than a recently proposed method for handling the imbalance problem in network traffic classification.
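As a hedged illustration of the balancing step (not the paper's architecture: the layer sizes, feature dimension, and class count below are invented for the sketch), a class-conditional generator in the AC-GAN style can be sampled per minority class:

```python
import torch
import torch.nn as nn

# Minimal class-conditional generator in the AC-GAN style.
# Dimensions (latent, feature, number of classes) are illustrative only.
N_CLASSES, LATENT, FEATURES = 5, 64, 128

class CondGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(N_CLASSES, LATENT)
        self.net = nn.Sequential(
            nn.Linear(2 * LATENT, 256), nn.ReLU(),
            nn.Linear(256, FEATURES),
        )

    def forward(self, z, labels):
        # Condition the noise on the requested traffic class.
        return self.net(torch.cat([z, self.embed(labels)], dim=1))

gen = CondGenerator()  # assume it has been adversarially trained

def oversample(minority_class, n_needed):
    """Draw synthetic flow-feature vectors for an under-represented class."""
    z = torch.randn(n_needed, LATENT)
    labels = torch.full((n_needed,), minority_class, dtype=torch.long)
    with torch.no_grad():
        return gen(z, labels)

synthetic = oversample(minority_class=3, n_needed=1000)
print(synthetic.shape)  # torch.Size([1000, 128])
```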
Hashing has been a widely adopted technique for nearest neighbor search in large-scale image retrieval tasks. Recent research has shown that leveraging supervised information can lead to high-quality hashing. However, the cost of annotating data is often an obstacle when applying supervised hashing to a new domain. Moreover, the results can suffer from a robustness problem, as the data at training and test stages may come from different distributions. This paper explores generating synthetic data through semi-supervised generative adversarial networks (GANs), which leverage largely unlabeled data together with limited labeled training data to produce highly compelling data with intrinsic invariance and global coherence, for a better understanding of the statistical structure of natural data. We demonstrate that the above two limitations can be well mitigated by applying the synthetic data to hashing. Specifically, a novel deep semantic hashing with GANs (DSH-GANs) is presented, which mainly consists of four components: a deep convolutional neural network (CNN) for learning image representations, an adversary stream to distinguish synthetic images from real ones, a hash stream for encoding image representations into hash codes, and a classification stream. The whole architecture is trained end-to-end by jointly optimizing three losses: an adversarial loss to correctly label each sample as synthetic or real, a triplet ranking loss to preserve the relative similarity ordering in the input real-synthetic triplets, and a classification loss to classify each sample accurately. Extensive experiments conducted on both the CIFAR-10 and NUS-WIDE image benchmarks validate the capability of exploiting synthetic images for hashing. Our framework also achieves superior results compared to state-of-the-art deep hash models.
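The sketch below shows one plausible way the three losses could be combined; the tensors stand in for outputs of the four streams, and the shapes and equal loss weights are assumptions for the illustration, not the paper's settings.

```python
import torch
import torch.nn.functional as F

# Illustrative joint objective: adversarial + triplet ranking + classification.
# All tensors below stand in for outputs of the CNN/hash/adversary/classifier
# streams; shapes and loss weights are assumptions for the sketch.
B, BITS, N_CLASSES = 32, 48, 10

real_logit = torch.randn(B, 1)             # adversary stream on real images
fake_logit = torch.randn(B, 1)             # adversary stream on synthetic ones
anchor = torch.tanh(torch.randn(B, BITS))  # relaxed hash codes
positive = torch.tanh(torch.randn(B, BITS))
negative = torch.tanh(torch.randn(B, BITS))
class_logits = torch.randn(B, N_CLASSES)   # classification stream
labels = torch.randint(0, N_CLASSES, (B,))

# Discriminator-side adversarial loss: label real as 1, synthetic as 0.
adv = F.binary_cross_entropy_with_logits(real_logit, torch.ones(B, 1)) + \
      F.binary_cross_entropy_with_logits(fake_logit, torch.zeros(B, 1))

# Triplet ranking loss preserves relative similarity in code space.
tri = F.triplet_margin_loss(anchor, positive, negative, margin=1.0)

# Classification loss keeps the codes semantically discriminative.
cls = F.cross_entropy(class_logits, labels)

total = adv + tri + cls  # equal weights assumed for illustration
print(float(total))
```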
Large-scale mobile traffic analytics is becoming essential to digital infrastructure provisioning, public transportation, events planning, and other domains. Monitoring city-wide mobile traffic is, however, a complex and costly process that relies on dedicated probes. Some of these probes have limited precision or coverage; others gather tens of gigabytes of logs daily, which independently offer limited insights. Extracting fine-grained patterns involves expensive spatial aggregation of measurements, storage, and post-processing. In this paper, we propose a mobile traffic super-resolution technique that overcomes these problems by inferring narrowly localised traffic consumption from coarse measurements. We draw inspiration from image processing and design a deep-learning architecture tailored to mobile networking, which combines Zipper Network (ZipNet) and Generative Adversarial Network (GAN) models. This makes it possible to uniquely capture spatio-temporal relations between traffic volume snapshots routinely monitored over broad coverage areas ('low-resolution') and the corresponding consumption at the 0.05 km² level ('high-resolution') usually obtained only after intensive computation. Experiments we conduct with a real-world dataset demonstrate that the proposed ZipNet(-GAN) infers traffic consumption with remarkable accuracy and up to 100× higher granularity compared to standard probing, while outperforming existing data interpolation techniques. To our knowledge, this is the first time super-resolution concepts are applied to large-scale mobile traffic analysis, and our solution is the first to infer fine-grained urban traffic patterns from coarse aggregates.
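To make the low-resolution to high-resolution mapping concrete, here is a toy super-resolution generator for gridded traffic snapshots; the grid sizes, upscaling factor, and layers are invented for the sketch and are far simpler than the ZipNet(-GAN) architecture itself.

```python
import torch
import torch.nn as nn

# Toy traffic super-resolution generator: maps a coarse traffic snapshot
# (e.g. 20x20 cells) to a finer grid (e.g. 80x80, a 4x upscaling).
class TrafficSR(nn.Module):
    def __init__(self, scale=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),  # rearranges channels into spatial detail
        )

    def forward(self, coarse):
        return self.net(coarse)

model = TrafficSR()
coarse = torch.rand(1, 1, 20, 20)  # one low-resolution traffic snapshot
fine = model(coarse)
print(fine.shape)                  # torch.Size([1, 1, 80, 80])
```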
Cross-modal audio-visual perception has been a long-standing topic in psychology and neurology, and various studies have discovered strong correlations in human perception of auditory and visual stimuli. Despite work on computational multimodal modeling, the problem of cross-modal audio-visual generation has not been systematically studied in the literature. In this paper, we make the first attempt to solve this cross-modal generation problem by leveraging the power of deep generative adversarial training. Specifically, we use conditional generative adversarial networks to achieve cross-modal audio-visual generation of musical performances. We explore different encoding methods for audio and visual signals, and work on two scenarios: instrument-oriented generation and pose-oriented generation. Being the first to explore this new problem, we compose two new datasets with pairs of images and sounds of musical performances of different instruments. Our experiments using both classification and human evaluation demonstrate that our model has the ability to generate one modality (audio or visual) from the other to a good extent. Our experiments on various design choices, along with the datasets, will facilitate future research in this new problem space.
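A minimal sketch of the conditioning idea in the sound-to-image direction follows; the audio embedding size, image resolution, and layers are assumptions for illustration, not the paper's encoders.

```python
import torch
import torch.nn as nn

# Sketch of sound-to-image conditional generation: the generator is
# conditioned on a continuous audio embedding rather than a discrete label.
AUDIO_DIM, LATENT = 128, 100

class Sound2Image(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(LATENT + AUDIO_DIM, 64 * 8 * 8)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, z, audio_emb):
        h = self.fc(torch.cat([z, audio_emb], dim=1)).view(-1, 64, 8, 8)
        return self.deconv(h)  # 3 x 32 x 32 image conditioned on the sound

gen = Sound2Image()
img = gen(torch.randn(4, LATENT), torch.randn(4, AUDIO_DIM))
print(img.shape)  # torch.Size([4, 3, 32, 32])
```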
The paper presents a fully automatic, end-to-end trainable system to colorize grayscale images. Colorization is a highly under-constrained problem. In order to produce realistic outputs, the proposed approach takes advantage of recent advances in deep learning and generative networks. To achieve plausible colorization, the paper investigates conditional Wasserstein Generative Adversarial Networks (WGAN) [3] as a solution to this problem. In addition to the adversarial loss learned by the WGAN, a loss function consisting of two classification components is proposed. The first classification loss measures how much the predicted colored images differ from the ground truth. The second classification loss component makes use of ground-truth semantic classification labels in order to learn meaningful intermediate features. Finally, the WGAN training procedure pushes the predictions toward the manifold of natural images. The system is validated using a user study and a semantic interpretability test, and achieves results comparable to [1] on the ImageNet dataset [10].
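For reference, a minimal version of the WGAN critic update the paper builds on, in the original weight-clipped formulation; the toy critic and hyperparameters below are placeholders, not the paper's colorization network.

```python
import torch

# Minimal WGAN critic update with weight clipping. `critic` maps an image
# batch to one scalar score per sample; `opt` is its optimizer.
def critic_step(critic, opt, real, fake, clip=0.01):
    opt.zero_grad()
    # Wasserstein estimate: maximize E[D(real)] - E[D(fake)],
    # i.e. minimize the negation below.
    loss = critic(fake.detach()).mean() - critic(real).mean()
    loss.backward()
    opt.step()
    # Clip weights after the step to (crudely) enforce a Lipschitz bound.
    with torch.no_grad():
        for p in critic.parameters():
            p.clamp_(-clip, clip)
    return loss.item()

# Toy usage with a linear critic and random "images".
critic = torch.nn.Sequential(torch.nn.Flatten(),
                             torch.nn.Linear(3 * 32 * 32, 1))
opt = torch.optim.RMSprop(critic.parameters(), lr=5e-5)
real, fake = torch.rand(8, 3, 32, 32), torch.rand(8, 3, 32, 32)
print(critic_step(critic, opt, real, fake))
```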
Existing generative adversarial network (GAN) methods use different criteria to distinguish between real and fake samples, such as probability [9], energy [44], or other losses [30]. In this paper, by employing the merits of deep metric learning, we propose a novel metric-based generative adversarial network (MBGAN), which uses a distance criterion to distinguish between real and fake samples. Specifically, the discriminator of MBGAN adopts a triplet structure and learns a deep nonlinear transformation that maps input samples into a new feature space. In the transformed space, the distance between real samples is minimized, while the distance between real and fake samples is maximized. As in the adversarial procedure of existing GANs, the generator is trained to produce synthesized examples that are close to real examples, while the discriminator is trained to push the distance between real and fake samples beyond a large margin. Meanwhile, instead of using a fixed margin, we adopt a data-dependent margin [30], so that the generator can focus on improving synthesized samples of poor quality instead of wasting effort on well-produced ones. Our proposed method is verified on various benchmarks, such as CIFAR-10, SVHN, and CelebA, and generates high-quality samples.
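The sketch below illustrates such a distance criterion: a triplet-style discriminator loss over a learned embedding, with one plausible data-dependent margin (the batch mean of real-fake distances). The margin choice and the toy embedding are assumptions, not the paper's.

```python
import torch
import torch.nn.functional as F

# Triplet-style metric discriminator loss: pull real-real pairs together,
# push real-fake pairs apart beyond a margin derived from the data itself.
def metric_d_loss(f, real_a, real_b, fake):
    d_rr = F.pairwise_distance(f(real_a), f(real_b))  # real vs. real
    d_rf = F.pairwise_distance(f(real_a), f(fake))    # real vs. fake
    margin = d_rf.detach().mean()                     # data-dependent margin
    # Hinge: loss vanishes once d_rf exceeds d_rr by the margin.
    return F.relu(d_rr - d_rf + margin).mean()

f = torch.nn.Linear(128, 32)  # toy embedding network standing in for D
xa, xb, xf = (torch.randn(16, 128) for _ in range(3))
print(metric_d_loss(f, xa, xb, xf).item())
```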
Missing values are common in many machine learning problems, and much effort has been made to handle missing data in order to improve the performance of the learned model. Sometimes, however, the task is not to train a model on unlabeled/labeled data with missing values, but to process examples according to the values of some specified features, so there is a pressing need for methods that predict the missing values themselves. In this paper, we focus on learning from the known values to recover each missing value as close as possible to the true one. Predicting missing values is difficult because the structure of the data matrix is unknown and some missing values may depend on other missing values. We solve the problem by recovering the complete data matrix under three reasonable constraints: feature relationships, an upper bound on the recovery error, and class relationships. The proposed algorithm can deal with both unlabeled and labeled data, and a generative adversarial idea is used on labeled data to transfer knowledge. Extensive experiments have been conducted to show the effectiveness of the proposed algorithms.
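As a generic stand-in for the recovery setting (it implements plain low-rank completion on observed entries, not the paper's three constraints or its adversarial component), the following sketch shows how missing entries can be predicted from known ones:

```python
import torch

# Generic low-rank matrix completion: fit a factorization to the observed
# entries, then read off predictions for the missing ones.
def complete(X, mask, rank=5, steps=500, lr=0.05):
    n, d = X.shape
    U = torch.randn(n, rank, requires_grad=True)
    V = torch.randn(d, rank, requires_grad=True)
    opt = torch.optim.Adam([U, V], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Reconstruction error is measured only on the known entries.
        loss = ((U @ V.T - X)[mask] ** 2).mean()
        loss.backward()
        opt.step()
    return (U @ V.T).detach()

X = torch.randn(20, 8)
mask = torch.rand(20, 8) > 0.3  # True where the value is observed
X_hat = complete(X * mask, mask)
print(X_hat[~mask][:5])         # predictions for five missing entries
```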
In this paper, we propose an autoencoder-based generative adversarial network (GAN) for automatic image generation, which we call the "stylized adversarial autoencoder". Unlike existing generative autoencoders, which typically impose a prior distribution over the latent vector, the proposed approach splits the latent variable into two components: a style feature and a content feature, both encoded from real images. This split enables us to adjust the content and the style of the generated image arbitrarily by choosing different exemplary images. In addition, a multiclass classifier is adopted as the discriminator in the GAN, which makes the generated images more realistic. We performed experiments on handwritten digit, scene text, and face datasets, in which the stylized adversarial autoencoder achieves superior results for image generation and also markedly improves the corresponding supervised recognition task.
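A minimal sketch of the style/content split follows; the layer sizes and latent dimensions are invented, and the paper's adversarial and classification components are omitted.

```python
import torch
import torch.nn as nn

# Two encoders produce separate latent codes; the decoder consumes their
# concatenation, so style and content can come from different exemplars.
STYLE, CONTENT = 32, 32

class SplitAE(nn.Module):
    def __init__(self, dim=784):
        super().__init__()
        self.enc_style = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(),
                                       nn.Linear(256, STYLE))
        self.enc_content = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(),
                                         nn.Linear(256, CONTENT))
        self.dec = nn.Sequential(nn.Linear(STYLE + CONTENT, 256), nn.ReLU(),
                                 nn.Linear(256, dim), nn.Sigmoid())

    def forward(self, style_img, content_img):
        s = self.enc_style(style_img)      # e.g. handwriting style
        c = self.enc_content(content_img)  # e.g. which digit to draw
        return self.dec(torch.cat([s, c], dim=1))

model = SplitAE()
out = model(torch.rand(4, 784), torch.rand(4, 784))  # mix-and-match exemplars
print(out.shape)  # torch.Size([4, 784])
```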
Distractor generation is a crucial step in fill-in-the-blank question generation. We propose a generative model, learned by training generative adversarial nets (GANs), to create useful distractors. Our method uses only context information and does not use the correct answer, which is completely different from previous ontology-based or similarity-based approaches. Trained on the Wikipedia corpus, the proposed model is able to predict Wiki entities as distractors. Our method is evaluated on two biology question datasets collected from Wikipedia and actual college-level exams. Experimental results show that our context-based method achieves performance comparable to a frequently used word2vec-based method on the Wiki dataset. In addition, we propose a second-stage learner that combines the strengths of the two methods, further improving performance on both datasets, with 51.7% and 48.4% of generated distractors being acceptable.