Title | Noise Reduction Framework for Distantly Supervised Relation Extraction with Human in the Loop |
Publication Type | Conference Paper |
Year of Publication | 2020 |
Authors | Zhang, Xinyuan; Liu, Hongzhi; Wu, Zhonghai |
Conference Name | 2020 IEEE 10th International Conference on Electronics Information and Emergency Communication (ICEIEC) |
Keywords | data mining, Data models, distant supervision, feature extraction, human factors, human in the loop, noise reduction, pubcrawl, relation extraction, Semantics, Training, Training data |
Abstract | Distant supervision is a widely used data labeling method for relation extraction. While aligning a knowledge base with a corpus, distant supervision produces a large number of wrong labels, which are defined as noise. Pattern-based denoising models have made great progress in selecting trustworthy sentences (instances). However, writing relation-specific patterns relies heavily on expert knowledge and is labor-intensive. To solve these problems, we propose a noise reduction framework, NOIR, that iteratively selects trustworthy sentences with a little human help. Under the guidance of experts, the iterative process can avoid semantic drift. In addition, NOIR can help experts discover relation-specific tokens that are hard to think of. Experimental results on three real-world datasets show the effectiveness of the proposed method compared with state-of-the-art methods. |
DOI | 10.1109/ICEIEC49280.2020.9152287 |
Citation Key | zhang_noise_2020 |