Sequence Modeling with Hierarchical Deep Generative Models with Dual Memory

Title: Sequence Modeling with Hierarchical Deep Generative Models with Dual Memory
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Zheng, Yanan; Wen, Lijie; Wang, Jianmin; Yan, Jun; Ji, Lei
Conference Name: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management
Publisher: ACM
Conference Location: New York, NY, USA
ISBN Number: 978-1-4503-4918-5
Keywords: dual memory mechanism, hierarchical deep generative models, Human Behavior, inference and learning, Metrics, pubcrawl, random key generation, resilience, Resiliency, Scalability, sequence modeling
Abstract

Deep Generative Models (DGMs) are able to extract high-level representations from massive unlabeled data and are explainable from a probabilistic perspective. Such characteristics favor sequence modeling tasks. However, modeling sequences with DGMs remains a major challenge. Unlike real-valued data, which can be fed directly into models, sequence data consist of discrete elements and must first be transformed into some representation. This leads to two challenges. First, high-level features are sensitive to small variations in the inputs and to the way the data are represented. Second, the models are more likely to lose long-term information across multiple transformations. In this paper, we propose a Hierarchical Deep Generative Model with Dual Memory to address these two challenges. Furthermore, we provide a method to efficiently perform inference and learning on the model. The proposed model extends basic DGMs with an improved, hierarchically organized multi-layer architecture. In addition, the model incorporates memories along two directions, denoted broad memory and deep memory, respectively. The model is trained end-to-end by optimizing a variational lower bound on the data log-likelihood using an improved stochastic variational method. We perform experiments on several tasks with various datasets and obtain excellent results. The language modeling results show that our method significantly outperforms state-of-the-art methods in terms of generative performance. Extended experiments, including document modeling and sentiment analysis, demonstrate the effectiveness of the dual memory mechanism and the latent representations. Random text generation gives a direct illustration of the model's advantages.
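
For orientation, the "variational lower bound on data log-likelihood" mentioned in the abstract can be illustrated by the standard evidence lower bound (ELBO) for a generic latent-variable sequence model. This is only a sketch under assumed notation: the symbols x = (x_1, ..., x_T) for a sequence, z for a latent variable, p_theta for the generative model, and q_phi for the approximate posterior are not taken from this record, and the paper's hierarchical, dual-memory objective may differ in form.

```latex
% Generic single-latent-variable ELBO (assumed notation, not the paper's exact objective)
\log p_\theta(x)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  \;-\;
  \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p_\theta(z) \right),
\qquad
\log p_\theta(x \mid z) = \sum_{t=1}^{T} \log p_\theta\!\left(x_t \mid x_{<t}, z\right).
```

In this generic form, the first term rewards reconstructing the sequence from the latent variable, while the KL term keeps the approximate posterior close to the prior.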

URL: https://dl.acm.org/citation.cfm?doid=3132847.3132952
DOI: 10.1145/3132847.3132952
Citation Key: zheng_sequence_2017