Title | Enhanced Word Embedding Method in Text Classification |
Publication Type | Conference Paper |
Year of Publication | 2020 |
Authors | Hu, Shengze; He, Chunhui; Ge, Bin; Liu, Fang |
Conference Name | 2020 6th International Conference on Big Data and Information Analytics (BigDIA) |
Keywords | Big Data, Classification algorithms, composability, Deep Learning, distributed word embedding, Human Behavior, human factors, Metrics, natural language processing, Neural networks, pubcrawl, Scalability, semantic similarity, Task Analysis, text analytics, text categorization, text classification, Training |
Abstract | In natural language processing (NLP) tasks, word embedding technology affects the accuracy of deep neural network algorithms. Because current word embedding methods cannot represent words and phrases together in the same vector space, we propose an enhanced word embedding (EWE) method. Before the word embedding is trained, this method applies a unique sentence reorganization technique to rewrite all the sentences in the original training corpus. The original corpus and the reorganized corpus are then merged to form the training corpus of the distributed word embedding model, so that words and phrases coexist in the same vector space. We carried out experiments on three classic benchmark datasets to demonstrate the effectiveness of the EWE algorithm. The results show that the EWE method can significantly improve the classification performance of the CNN model. |
DOI | 10.1109/BigDIA51454.2020.00012 |
Citation Key | hu_enhanced_2020 |
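
The corpus-merging idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how sentences are reorganized, so everything here is an assumption for illustration: the hypothetical `reorganize` helper greedily joins entries from a hand-supplied phrase list into single underscore-joined tokens, and the hypothetical `build_training_corpus` helper concatenates the original and rewritten sentences. This is not the authors' implementation.

```python
def reorganize(tokens, phrases, joiner="_"):
    """Rewrite one tokenized sentence, merging known phrases into single tokens.

    Assumption: `phrases` is a set of token tuples supplied by the caller;
    the paper's actual reorganization rule is not given in the abstract.
    Matching is greedy, preferring the longest phrase at each position.
    """
    max_len = max((len(p) for p in phrases), default=1)
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, down to two-token phrases.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in phrases:
                out.append(joiner.join(candidate))
                i += n
                break
        else:
            # No phrase starts here: keep the word unchanged.
            out.append(tokens[i])
            i += 1
    return out


def build_training_corpus(corpus, phrases):
    """Return the original sentences followed by their reorganized versions."""
    reorganized = [reorganize(sentence, phrases) for sentence in corpus]
    return list(corpus) + reorganized


# Illustrative data only, not taken from the paper.
corpus = [["natural", "language", "processing", "is", "fun"]]
phrases = {("natural", "language", "processing"), ("natural", "language")}
merged = build_training_corpus(corpus, phrases)
vocabulary = {token for sentence in merged for token in sentence}
```

Because the merged corpus contains both the original words and the joined phrase tokens, an embedding model trained on it would assign vectors to both kinds of unit in one space, which is the coexistence property the abstract describes.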