Title | Improving Text Classification Using Knowledge in Labels |
Publication Type | Conference Paper |
Year of Publication | 2021 |
Authors | Zhang, Cheng, Yamana, Hayato |
Conference Name | 2021 IEEE 6th International Conference on Big Data Analytics (ICBDA) |
Keywords | BERT, Big Data, Bit error rate, composability, Conferences, Deep Learning, encoding, Human Behavior, Metrics, natural language processing, Natural languages, Numerical models, pubcrawl, Scalability, text analytics, text categorization, text classification, text mining |
Abstract | Various algorithms and models have been proposed to address text classification tasks; however, they rarely consider incorporating the additional knowledge hidden in class labels. We argue that hidden information in class labels leads to better classification accuracy. In this study, instead of encoding the labels into numerical values, we incorporated the knowledge in the labels into the original model without changing the model architecture. We combined the output of an original classification model with the relatedness calculated based on the embeddings of a sequence and a keyword set. A keyword set is a word set to represent knowledge in the labels. Usually, it is generated from the classes while it could also be customized by the users. The experimental results show that our proposed method achieved statistically significant improvements in text classification tasks. The source code and experimental details of this study can be found on GitHub: https://github.com/HeroadZ/KiL. |
DOI | 10.1109/ICBDA51983.2021.9403092 |
Citation Key | zhang_improving_2021 |
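
The abstract describes combining a classifier's output with a relatedness score computed from the embeddings of a sequence and of a per-class keyword set. The sketch below illustrates one way such a combination could look; it is not the authors' implementation (see the linked repository for that). The mean cosine similarity, the softmax normalization, the mixing weight `alpha`, the function names, and the toy two-dimensional embeddings are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relatedness(seq_emb, keyword_embs):
    """Assumed relatedness: mean cosine similarity between the sequence and each keyword."""
    return sum(cosine(seq_emb, k) for k in keyword_embs) / len(keyword_embs)


def combine(model_probs, seq_emb, keyword_sets, alpha=0.5):
    """Mix classifier probabilities with normalized relatedness (alpha is hypothetical)."""
    rel = softmax([relatedness(seq_emb, ks) for ks in keyword_sets])
    return [(1 - alpha) * p + alpha * r for p, r in zip(model_probs, rel)]


# Toy data: the classifier is undecided, so the keyword sets break the tie.
model_probs = [0.5, 0.5]
seq_emb = [1.0, 0.0]
keyword_sets = [
    [[1.0, 0.0], [0.9, 0.1]],  # keyword embeddings for class 0
    [[0.0, 1.0]],              # keyword embeddings for class 1
]
scores = combine(model_probs, seq_emb, keyword_sets)
predicted = max(range(len(scores)), key=scores.__getitem__)
```

In this toy case the sequence embedding points in the same direction as the class 0 keywords, so class 0 receives the higher combined score even though the classifier alone is undecided. The underlying model is left untouched, which matches the abstract's point that the architecture does not change.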