Title | Extractive Persian Summarizer for News Websites |
Publication Type | Conference Paper |
Year of Publication | 2019 |
Authors | Kermani, Fatemeh Hojati; Ghanbari, Shirin |
Conference Name | 2019 5th International Conference on Web Research (ICWR) |
Date Published | April |
Keywords | automatic extractive text summarization, data mining, English words, extractive Persian summarizer, extractive text summarization, feature extraction, feature vector, genetic algorithms, heuristic methods, heuristical and semantical, Human Behavior, Libraries, natural language processing, Persian news articles, pre-processing, pubcrawl, Resiliency, salient features, salient sentences, Scalability, semantic methods, Semantics, sentence length, statistical, statistical methods, text analysis, textual information, Tools, Web sites |
Abstract | Automatic extractive text summarization is the process of condensing textual information while preserving its important concepts. After pre-processing the input Persian news articles, the proposed method generates a feature vector of salient sentences from a combination of statistical, semantic, and heuristic methods; these sentences are then scored and concatenated accordingly. The scoring of the salient features is based on the article's title, proper nouns, pronouns, sentence length, keywords, topic words, sentence position, English words, and quotations. Experimental results on measures including recall, F-measure, and ROUGE-N are presented and compared with those of other Persian summarizers; the proposed method is shown to achieve higher performance. |
DOI | 10.1109/ICWR.2019.8765279 |
Citation Key | kermani_extractive_2019 |
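
The feature-based scoring described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the weights, the `ideal_len` parameter, the naive tokenizer, and the function names are all assumptions made for illustration, and only five of the listed features (title, sentence position, sentence length, quotations, English words) are covered. Each sentence is scored as a weighted sum of its feature values, and the top-scoring sentences are concatenated in their original order.

```python
import re

# Hypothetical weights: the record does not say how the paper weights its features.
WEIGHTS = {"title": 2.0, "position": 1.0, "length": 0.5, "quotation": 0.5, "english": 0.5}


def tokenize(text):
    # Naive tokenizer: runs of Unicode word characters (this also matches Persian letters).
    # A real Persian pipeline would need proper normalization and stemming.
    return re.findall(r"\w+", text.lower())


def sentence_features(sentence, index, total, title_tokens, ideal_len=20):
    """Return feature values in [0, 1] for one sentence (illustrative definitions)."""
    tokens = tokenize(sentence)
    if not tokens:
        return dict.fromkeys(WEIGHTS, 0.0)
    return {
        # Fraction of title words that also occur in the sentence.
        "title": len(set(tokens) & title_tokens) / max(len(title_tokens), 1),
        # Earlier sentences score higher.
        "position": 1.0 - index / total,
        # Closeness of the sentence length to an assumed ideal length.
        "length": 1.0 - min(abs(len(tokens) - ideal_len) / ideal_len, 1.0),
        # 1 if the sentence contains a quotation mark (ASCII or guillemets).
        "quotation": 1.0 if re.search(r'["«»]', sentence) else 0.0,
        # 1 if the sentence contains any Latin letters.
        "english": 1.0 if re.search(r"[A-Za-z]", sentence) else 0.0,
    }


def summarize(title, sentences, k=2):
    """Pick the k highest-scoring sentences and return them in document order."""
    title_tokens = set(tokenize(title))
    total = len(sentences)
    scored = []
    for i, sentence in enumerate(sentences):
        feats = sentence_features(sentence, i, total, title_tokens)
        score = sum(WEIGHTS[name] * value for name, value in feats.items())
        scored.append((score, i))
    # Highest score first; ties are broken in favor of earlier sentences.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:k]
    return [sentences[i] for _, i in sorted(top, key=lambda pair: pair[1])]
```

Calling `summarize(title, sentences, k=2)` on a title string and a list of already split sentences returns the two selected sentences as a list; joining them with spaces yields the extractive summary. The keywords of this record mention genetic algorithms, but the record does not say what role they play in the method, so the fixed weights here are placeholders only.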