Word embedding is one of the basic word representation methods in natural language processing. It maps a word into a dense real-valued vector space based on the hypothesis that words with similar contexts have similar meanings. Models such as NNLM, C&W, CBOW, and Skip-gram have been designed for learning word embeddings and are widely used in many NLP tasks. However, these models assume that each word has only one meaning, which is contrary to how real language works. In this paper we propose a new word unit with multiple meanings and an algorithm to distinguish those meanings by the word's context. This new unit can be embedded in most language models and obtains a series of efficient representations by learning variable embeddings. We evaluate a new model, MCBOW, which integrates CBOW with our word unit, on a word similarity task and on several downstream experiments. The results indicate that the new model can learn different meanings of a word and achieves better results on some other tasks.
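The abstract does not spell out how meanings are distinguished by context, so the following is only a minimal sketch of one common approach, not necessarily the algorithm used in MCBOW. It assumes each word keeps several sense vectors and picks the one with the highest cosine similarity to the averaged context vector. The function names (`mean_vector`, `cosine`, `select_sense`), the toy two-dimensional vectors, and the example word are all hypothetical.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_sense(context_vectors, sense_vectors):
    """Return the index of the sense vector closest to the mean context.

    Assumption: sense selection by nearest cosine similarity. The paper's
    actual selection rule is not stated in the abstract.
    """
    context = mean_vector(context_vectors)
    scores = [cosine(context, sense) for sense in sense_vectors]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical word with two senses in a toy 2-d embedding space.
sense_vectors = [
    [1.0, 0.0],  # sense 0, e.g. a finance-like meaning
    [0.0, 1.0],  # sense 1, e.g. a river-like meaning
]

# Hypothetical embeddings of the surrounding context words.
finance_context = [[0.9, 0.1], [0.8, 0.2]]
river_context = [[0.1, 0.9], [0.2, 0.7]]

finance_sense = select_sense(finance_context, sense_vectors)
river_sense = select_sense(river_context, sense_vectors)
```

In a CBOW-style model, the selected sense vector would then be the one updated for that training window, so each sense gradually specializes to the contexts that chose it.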