A Simple Data Augmentation Method to Improve the Performance of Named Entity Recognition Models in Medical Domain

Title: A Simple Data Augmentation Method to Improve the Performance of Named Entity Recognition Models in Medical Domain
Publication Type: Conference Paper
Year of Publication: 2021
Authors: Issifu, Abdul Majeed; Ganiz, Murat Can
Conference Name: 2021 6th International Conference on Computer Science and Engineering (UBMK)
Keywords: Adaptation models, annotations, BERT, BioBERT, Computer science, data augmentation, data deletion, Deep Learning, medical data, NER, privacy, pubcrawl, Scalability, Terminology, text categorization, Text recognition
Abstract: Easy Data Augmentation was originally developed for text classification tasks. It consists of four basic methods: Synonym Replacement, Random Insertion, Random Deletion, and Random Swap. These methods yield accuracy improvements on several deep neural network models. In this study, we apply them to a new domain by augmenting Named Entity Recognition datasets from the medical domain. Although augmentation is much more difficult here, because named entities consist of words or word groups within sentences, we show that we can improve named entity recognition performance.
DOI: 10.1109/UBMK52708.2021.9558986
Citation Key: issifu_simple_2021
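
The four operations named in the abstract can be sketched for token-level NER data as follows. This is a minimal illustration and not the authors' implementation: it assumes BIO (IOB2) labels, and it makes the simplifying choice of only touching tokens tagged "O" so that entity spans stay intact, which the abstract does not specify. The SYNONYMS table, the function names, and the example sentence are all hypothetical; a real setup might draw synonyms from WordNet or a medical thesaurus.

```python
import random

# Hypothetical toy synonym table (assumption, not from the paper).
SYNONYMS = {"severe": ["serious", "acute"], "reported": ["described", "noted"]}


def _outside_positions(labels):
    """Indices of tokens tagged 'O', i.e. not part of any entity span."""
    return [i for i, lab in enumerate(labels) if lab == "O"]


def synonym_replacement(tokens, labels, n, rng):
    """Replace up to n non-entity tokens with a synonym; labels are unchanged."""
    tokens = list(tokens)
    candidates = [i for i in _outside_positions(labels)
                  if tokens[i].lower() in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        tokens[i] = rng.choice(SYNONYMS[tokens[i].lower()])
    return tokens, list(labels)


def random_insertion(tokens, labels, n, rng):
    """Insert up to n synonyms of non-entity tokens, each labelled 'O'."""
    tokens, labels = list(tokens), list(labels)
    for _ in range(n):
        sources = [i for i in _outside_positions(labels)
                   if tokens[i].lower() in SYNONYMS]
        if not sources:
            break
        word = rng.choice(SYNONYMS[tokens[rng.choice(sources)].lower()])
        # Inserting before an I- token would split an entity, so skip those slots.
        slots = [k for k in range(len(tokens) + 1)
                 if k == len(tokens) or not labels[k].startswith("I-")]
        k = rng.choice(slots)
        tokens.insert(k, word)
        labels.insert(k, "O")
    return tokens, labels


def random_swap(tokens, labels, n, rng):
    """Swap two non-entity tokens, n times; both are 'O', so labels are unchanged."""
    tokens = list(tokens)
    outside = _outside_positions(labels)
    if len(outside) >= 2:
        for _ in range(n):
            i, j = rng.sample(outside, 2)
            tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens, list(labels)


def random_deletion(tokens, labels, p, rng):
    """Drop each non-entity token with probability p; entity tokens are always kept."""
    kept = [(t, l) for t, l in zip(tokens, labels)
            if l != "O" or rng.random() >= p]
    if not kept:  # never return an empty sentence
        return list(tokens), list(labels)
    return [t for t, _ in kept], [l for _, l in kept]


def entity_spans(tokens, labels):
    """Collect (type, text) pairs for contiguous B-/I- spans, for sanity checks."""
    spans, inside = [], False
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B-"):
            spans.append((lab[2:], [tok]))
            inside = True
        elif lab.startswith("I-") and inside:
            spans[-1][1].append(tok)
        else:
            inside = False
    return [(kind, " ".join(words)) for kind, words in spans]


if __name__ == "__main__":
    rng = random.Random(0)
    toks = ["Patient", "reported", "severe", "chest", "pain",
            "after", "taking", "aspirin", "."]
    labs = ["O", "O", "O", "B-Symptom", "I-Symptom", "O", "O", "B-Drug", "O"]
    for op, arg in [(synonym_replacement, 1), (random_insertion, 1),
                    (random_swap, 1), (random_deletion, 0.2)]:
        print(op.__name__, op(toks, labs, arg, rng))
```

Each function returns a new token list together with an aligned label list, so the output can be appended to the training set as an additional sentence. Restricting edits to "O" tokens is the conservative option; modifying the entity mentions themselves would require re-deriving the labels for every changed span.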