KATE: K-Competitive Autoencoder for Text

Submitted by grigby1 on Tue, 02/06/2018 - 1:34pm

Title	KATE: K-Competitive Autoencoder for Text
Publication Type	Conference Paper
Year of Publication	2017
Authors	Chen, Yu; Zaki, Mohammed J.
Conference Name	Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher	ACM
Conference Location	New York, NY, USA
ISBN Number	978-1-4503-4887-4
Keywords	autoencoders, competitive learning, composability, Human Behavior, human factors, Metrics, pubcrawl, representation learning, Scalability, text analytics
Abstract	Autoencoders have been successful in learning meaningful representations from image datasets. However, their performance on text datasets has not been widely studied. Traditional autoencoders tend to learn possibly trivial representations of text documents due to their confounding properties such as high-dimensionality, sparsity and power-law word distributions. In this paper, we propose a novel k-competitive autoencoder, called KATE, for text documents. Due to the competition between the neurons in the hidden layer, each neuron becomes specialized in recognizing specific data patterns, and overall the model can learn meaningful representations of textual data. A comprehensive set of experiments shows that KATE can learn better representations than traditional autoencoders including denoising, contractive, variational, and k-sparse autoencoders. Our model also outperforms deep generative models, probabilistic topic models, and even word representation models (e.g., Word2Vec) in terms of several downstream tasks such as document classification, regression, and retrieval.
URL	https://dl.acm.org/citation.cfm?doid=3097983.3098017
DOI	10.1145/3097983.3098017
Citation Key	chen_kate:_2017
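
Illustrative note: the abstract describes competition between hidden-layer neurons but gives no algorithmic detail. The Python sketch below shows one generic way such competition could work, a top-k winner-take-all step in which only the k hidden activations of largest magnitude survive. It is an assumption-laden illustration, not the authors' method: the function names `k_competitive` and `encode`, the tanh activation, and the magnitude-based winner selection are all hypothetical choices made here, and any further mechanics of the published model are not reproduced.

```python
import math


def k_competitive(activations, k):
    """Keep the k activations with the largest magnitude; zero out the rest.

    Generic winner-take-all sketch (assumed, not taken from the paper).
    """
    if k <= 0:
        return [0.0] * len(activations)
    # Rank neuron indices by absolute activation, strongest first.
    ranked = sorted(
        range(len(activations)),
        key=lambda i: abs(activations[i]),
        reverse=True,
    )
    winners = set(ranked[:k])
    # Winners keep their value; losing neurons are silenced.
    return [a if i in winners else 0.0 for i, a in enumerate(activations)]


def encode(doc_vector, weights, biases, k):
    """Hypothetical encoder: tanh hidden layer followed by k-competition.

    weights holds one row per hidden neuron; doc_vector is a bag-of-words
    style input. Both layouts are assumptions for this example.
    """
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, doc_vector)) + b)
        for row, b in zip(weights, biases)
    ]
    return k_competitive(hidden, k)


if __name__ == "__main__":
    # Four hidden activations, two winners allowed.
    print(k_competitive([0.1, -0.9, 0.5, 0.05], 2))
```

Because only a few neurons stay active for any given document, each one is pushed toward responding to a distinct pattern, which is the intuition the abstract gives for why competition yields more meaningful text representations.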

Groups:

Science of Security VO