
Title: Enhanced word embedding with multiple prototypes
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Zheng, Y., Shi, Y., Guo, K., Li, W., Zhu, L.
Conference Name: 2017 4th International Conference on Industrial Economics System and Industrial Security Engineering (IEIS)
ISBN Number: 978-1-5386-0995-8
Keywords: Artificial neural networks, basic word representation methods, Biological system modeling, Computational modeling, Context modeling, dense real-valued vector space, distributed word representation, enhanced word embedding, Human Behavior, language models, MCBOW, multiple prototypes, natural language processing, NLP, Prototypes, pubcrawl, Resiliency, Scalability, similar context, similar meanings, vector space, word embedding, word embeddings learning, word representation, word similarity evaluation task, word unit
Abstract

Word embedding is one of the basic word representation methods in natural language processing: it maps a word into a dense real-valued vector space based on the hypothesis that words appearing in similar contexts have similar meanings. Models such as NNLM, C&W, CBOW, and Skip-gram have been designed for learning word embeddings and are widely used in many NLP tasks. However, these models assume that each word has only one meaning, which is contrary to real language use. In this paper we propose a new word unit with multiple meanings and an algorithm to distinguish among them by context. This new unit can be embedded in most language models to obtain a series of efficient representations by learning multiple embeddings per word. We evaluate MCBOW, a new model that integrates CBOW with our word unit, on a word similarity evaluation task and several downstream experiments; the results indicate that the new model can learn distinct meanings of a word and achieves better results on other tasks.
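
The abstract describes the multi-prototype idea only at a high level, and the paper's record here includes no reference code. The sketch below is a minimal illustration, not the authors' MCBOW implementation: it keeps several prototype vectors per word and selects a sense by cosine similarity between the averaged context vector and each prototype, in the CBOW spirit the abstract mentions. All names (e.g., select_sense, num_prototypes) and the fixed prototype count are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab = {"bank": 0, "river": 1, "money": 2, "deposit": 3, "water": 4}
dim = 50
num_prototypes = 3  # assumed fixed number of senses per word

# Several "sense" (prototype) vectors per word, plus one context
# vector per word, loosely following the CBOW input/output split.
sense_vecs = rng.normal(scale=0.1, size=(len(vocab), num_prototypes, dim))
context_vecs = rng.normal(scale=0.1, size=(len(vocab), dim))

def select_sense(word_id, context_ids):
    """Pick the prototype of word_id closest to the mean context vector."""
    ctx = context_vecs[context_ids].mean(axis=0)
    protos = sense_vecs[word_id]  # shape: (num_prototypes, dim)
    sims = protos @ ctx / (
        np.linalg.norm(protos, axis=1) * np.linalg.norm(ctx) + 1e-8
    )
    return int(np.argmax(sims))

# After training, "bank" next to river/water versus money/deposit
# would typically resolve to different prototypes.
sense_a = select_sense(vocab["bank"], [vocab["river"], vocab["water"]])
sense_b = select_sense(vocab["bank"], [vocab["money"], vocab["deposit"]])
print(sense_a, sense_b)
```

In a full model along these lines, the selected prototype would replace the single word vector in the CBOW objective during training, so that each prototype specializes to the contexts that select it.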

URL: http://ieeexplore.ieee.org/document/8078651/
DOI: 10.1109/IEIS.2017.8078651
Citation Key: zheng_enhanced_2017