Title | A Simple Data Augmentation Method to Improve the Performance of Named Entity Recognition Models in Medical Domain |
Publication Type | Conference Paper |
Year of Publication | 2021 |
Authors | Issifu, Abdul Majeed; Ganiz, Murat Can |
Conference Name | 2021 6th International Conference on Computer Science and Engineering (UBMK) |
Keywords | Adaptation models, annotations, BERT, BioBERT, Computer science, data augmentation, data deletion, Deep Learning, medical data, NER, privacy, pubcrawl, Scalability, Terminology, text categorization, Text recognition |
Abstract | Easy Data Augmentation was originally developed for text classification tasks. It consists of four basic methods: Synonym Replacement, Random Insertion, Random Deletion, and Random Swap. These methods yield accuracy improvements on several deep neural network models. In this study, we apply them to a new domain: we augment Named Entity Recognition datasets from the medical domain. Although augmentation is much more difficult here because named entities consist of single words or word groups within sentences, we show that it can improve named entity recognition performance. |
DOI | 10.1109/UBMK52708.2021.9558986 |
Citation Key | issifu_simple_2021 |
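
The four operations named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The record does not say how the paper keeps entity annotations intact, so the sketch makes its own assumptions: sentences are lists of (token, BIO label) pairs, only tokens labeled "O" are replaced, deleted, or swapped, and inserted tokens never split an entity span. The function names, the `synonyms` dictionary, and the sample labels used below are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (token, BIO label); this representation is an assumption


def synonym_replacement(pairs: List[Pair], synonyms: Dict[str, List[str]],
                        n: int, rng: random.Random) -> List[Pair]:
    """Replace up to n non-entity tokens with a synonym from a hypothetical dictionary."""
    out = list(pairs)
    candidates = [i for i, (tok, lab) in enumerate(out)
                  if lab == "O" and synonyms.get(tok.lower())]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = (rng.choice(synonyms[out[i][0].lower()]), "O")
    return out


def random_insertion(pairs: List[Pair], synonyms: Dict[str, List[str]],
                     n: int, rng: random.Random) -> List[Pair]:
    """Insert n synonyms of non-entity tokens at random positions outside entity spans."""
    out = list(pairs)
    pool = [s for tok, lab in pairs if lab == "O"
            for s in synonyms.get(tok.lower(), [])]
    if not pool:
        return out
    for _ in range(n):
        # A slot is allowed if the token after it does not continue an entity (I- label).
        slots = [i for i in range(len(out) + 1)
                 if i == len(out) or not out[i][1].startswith("I-")]
        out.insert(rng.choice(slots), (rng.choice(pool), "O"))
    return out


def random_deletion(pairs: List[Pair], p: float, rng: random.Random) -> List[Pair]:
    """Drop each non-entity token with probability p; entity tokens are always kept."""
    kept = [pair for pair in pairs if pair[1] != "O" or rng.random() >= p]
    if kept or not pairs:
        return kept
    return [rng.choice(pairs)]  # never return an empty sentence for non-empty input


def random_swap(pairs: List[Pair], n: int, rng: random.Random) -> List[Pair]:
    """Swap two randomly chosen non-entity tokens, n times."""
    out = list(pairs)
    outside = [i for i, (_, lab) in enumerate(out) if lab == "O"]
    if len(outside) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(outside, 2)
        out[i], out[j] = out[j], out[i]
    return out
```

Restricting every operation to "O" tokens is one simple way to keep labels aligned with their tokens. It is a design choice of this sketch, and the paper may handle entity tokens differently.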