Title | Research and Implementation of Data Extraction Method Based on NLP |
Publication Type | Conference Paper |
Year of Publication | 2020 |
Authors | Si, Y., Zhou, W., Gai, J. |
Conference Name | 2020 IEEE 14th International Conference on Anti-counterfeiting, Security, and Identification (ASID) |
Keywords | Chinese Text, Conferences, data extraction, data extraction method, data mining, feature extraction, feature word lists, Human Behavior, information extraction, language expression rules, natural language processing, NLP, pubcrawl, regular expression, Resiliency, rule template, rule-based method, Scalability, security, text analysis, unstructured Chinese text |
Abstract | To accurately extract data from unstructured Chinese text, this paper proposes a rule-based method built on natural language processing and regular expressions. The method uses the language expression rules of the data in the text, together with other related knowledge, to build feature word lists and a rule template that are matched against the text. Experimental results show that the accuracy of the designed algorithm is 94.09%. |
DOI | 10.1109/ASID50160.2020.9271745 |
Citation Key | si_research_2020 |
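
The rule-based approach described in the abstract can be illustrated with a minimal Python sketch. This record does not give the paper's actual feature word lists or rule template, so the word lists, the template, and the function names `build_pattern` and `extract` below are hypothetical, chosen only to show how feature word lists might be substituted into a regular-expression template and matched against Chinese text.

```python
import re

# Hypothetical feature word lists (not taken from the paper):
# attribute keywords and the unit words that may follow a value.
FEATURE_WORDS = {
    "attribute": ["身高", "体重"],
    "unit": ["厘米", "公斤"],
}

# Hypothetical rule template: attribute word, optional connective,
# a number, then a unit word. The {attribute} and {unit} slots are
# filled with alternations built from the feature word lists.
RULE_TEMPLATE = (
    r"(?P<attribute>{attribute})"
    r"(?:为|是|:|:)?"
    r"(?P<value>\d+(?:\.\d+)?)"
    r"(?P<unit>{unit})"
)


def build_pattern(template, feature_words):
    """Fill the template slots with regex alternations of the word lists."""
    alternations = {
        # Longest words first, so a longer word is preferred over its prefix.
        name: "|".join(
            re.escape(word) for word in sorted(words, key=len, reverse=True)
        )
        for name, words in feature_words.items()
    }
    return re.compile(template.format(**alternations))


def extract(text, pattern):
    """Return (attribute, value, unit) tuples for every match in the text."""
    return [
        (m.group("attribute"), float(m.group("value")), m.group("unit"))
        for m in pattern.finditer(text)
    ]


if __name__ == "__main__":
    pattern = build_pattern(RULE_TEMPLATE, FEATURE_WORDS)
    for record in extract("患者身高为175厘米,体重是68.5公斤。", pattern):
        print(record)
```

Keeping the word lists separate from the template means new attributes or units can be added without touching the regular expression itself. This is one plausible reading of "feature word lists and rule template", not a confirmed detail of the paper's implementation.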