Method of Textual Information Authorship Analysis Based on Stylometry
Title | Method of Textual Information Authorship Analysis Based on Stylometry |
Publication Type | Conference Paper |
Year of Publication | 2018 |
Authors | Vysotska, V., Lytvyn, V., Hrendus, M., Kubinska, S., Brodyak, O. |
Conference Name | 2018 IEEE 13th International Scientific and Technical Conference on Computer Sciences and Information Technologies (CSIT) |
ISBN Number | 978-1-5386-6464-3 |
Keywords | author publications, computational linguistics, Content analysis, Correlation, Dictionaries, formal approach, glottochronology, Human Behavior, Indexes, information retrieval, linguistic analysis, Linguistics, linguometry, Metrics, Monitoring, natural language processing, NLP methods, Porter stemmer, Porter stemming algorithm (Porter stemmer), pubcrawl, Radio frequency, reference text fragment, statistical linguistic analysis, stop words, stylometry, stylometry technologies usage, text analysis, text content monitoring, textual information authorship analysis, Ukrainian scientific texts |
Abstract | The paper examines how stylometry techniques can be used to determine the style of an author's publications. Statistical linguistic analysis of an author's text draws on text content monitoring, based on the Porter stemmer and NLP methods, to determine a set of stop words. This set is then used by stylometric methods to express, as a percentage, the degree to which the analyzed text belongs to a specific author. The article proposes a formal approach to defining an author's style in Ukrainian text. Experimental results are reported for the proposed method, which attributes an analyzed text to a particular author when a reference text fragment is available. The study was conducted on Ukrainian scientific texts in a technical field. |
URL | https://ieeexplore.ieee.org/document/8526608 |
DOI | 10.1109/STC-CSIT.2018.8526608 |
Citation Key | vysotska_method_2018 |
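
The abstract describes comparing an analyzed text against a reference fragment using stop-word statistics and reporting the result as a percentage. This record does not give the paper's actual formulas, so the following is only a minimal sketch of that general idea under stated assumptions: the English stop-word list, the function names `stop_word_profile` and `similarity_percent`, and the use of cosine similarity are all illustrative choices, not the authors' method. The paper itself works with Ukrainian texts and derives its stop-word set with the Porter stemmer and NLP methods, which this sketch does not attempt.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical stop-word list for illustration only; the paper derives its
# own set for Ukrainian scientific texts.
STOP_WORDS = ("the", "of", "and", "to", "in", "a", "is", "that", "for", "on")


def stop_word_profile(text, stop_words=STOP_WORDS):
    """Return the relative frequency of each stop word among all tokens."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in stop_words]


def similarity_percent(reference, candidate, stop_words=STOP_WORDS):
    """Cosine similarity of two stop-word profiles, scaled to 0..100.

    Assumption: cosine similarity stands in for whatever measure the
    paper uses; it is not taken from the source.
    """
    a = stop_word_profile(reference, stop_words)
    b = stop_word_profile(candidate, stop_words)
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    if norm == 0:
        return 0.0  # at least one text contains none of the stop words
    dot = sum(x * y for x, y in zip(a, b))
    return 100.0 * dot / norm


if __name__ == "__main__":
    reference = "The method is based on the analysis of stop words in a text."
    candidate = "The analysis of a text is the basis for the method."
    print(f"{similarity_percent(reference, candidate):.1f}%")
```

A score near 100 means the two texts use the chosen stop words in similar proportions. On its own this is weak evidence of shared authorship, and short fragments give unstable profiles.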