Biblio

Filters: Keyword is stylometry
2023-02-03
Oldal, Laura Gulyás, Kertész, Gábor.  2022.  Evaluation of Deep Learning-based Authorship Attribution Methods on Hungarian Texts. 2022 IEEE 10th Jubilee International Conference on Computational Cybernetics and Cyber-Medical Systems (ICCC). :000161–000166.
The range of text analysis methods in the field of natural language processing (NLP) has become more and more extensive thanks to the increasing computational resources of the 21st century. As a result, many deep learning-based solutions have been proposed for the purpose of authorship attribution, as they offer more flexibility and automated feature extraction compared to traditional statistical methods. A number of solutions have appeared for the attribution of English texts, however, the number of methods designed for Hungarian language is extremely small. Hungarian is a morphologically rich language, sentence formation is flexible and the alphabet is different from other languages. Furthermore, a language specific POS tagger, pretrained word embeddings, dependency parser, etc. are required. As a result, methods designed for other languages cannot be directly applied on Hungarian texts. In this paper, we review deep learning-based authorship attribution methods for English texts and offer techniques for the adaptation of these solutions to Hungarian language. As a part of the paper, we collected a new dataset consisting of Hungarian literary works of 15 authors. In addition, we extensively evaluate the implemented methods on the new dataset.
Nelson, Jared Ray, Shekaramiz, Mohammad.  2022.  Authorship Verification via Linear Correlation Methods of n-gram and Syntax Metrics. 2022 Intermountain Engineering, Technology and Computing (IETC). :1–6.
This research evaluates the accuracy of two methods of authorship prediction: syntactical analysis and n-gram, and explores its potential usage. The proposed algorithm measures n-gram, and counts adjectives, adverbs, verbs, nouns, punctuation, and sentence length from the training data, and normalizes each metric. The proposed algorithm compares the metrics of training samples to testing samples and predicts authorship based on the correlation they share for each metric. The severity of correlation between the testing and training data produces significant weight in the decision-making process. For example, if analysis of one metric approximates 100% positive correlation, the weight in the decision is assigned a maximum value for that metric. Conversely, a 100% negative correlation receives the minimum value. This new method of authorship validation holds promise for future innovation in fraud protection, the study of historical documents, and maintaining integrity within academia.
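The correlation-based weighting described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the use of character bigrams as the n-gram metric, and the linear mapping from correlation to weight are hypothetical choices, not the authors' published algorithm.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=2):
    """Count character n-grams in a lowercased text (assumed metric)."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def ngram_correlation(train_text, test_text, n=2):
    """Correlate the normalized n-gram profiles of a training and a testing sample."""
    a, b = char_ngrams(train_text, n), char_ngrams(test_text, n)
    keys = sorted(set(a) | set(b))
    total_a = sum(a.values()) or 1
    total_b = sum(b.values()) or 1
    return pearson([a[k] / total_a for k in keys],
                   [b[k] / total_b for k in keys])


def correlation_to_weight(r):
    """Assumed linear mapping: r = +1 gives the maximum weight 1.0, r = -1 the minimum 0.0."""
    return (r + 1.0) / 2.0
```

In a fuller version, one such weight would be computed per metric (n-grams, part-of-speech counts, punctuation, sentence length) and the weights combined into a single decision; how they are combined is not stated in the abstract.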
Ouamour, S., Sayoud, H.  2022.  Computational Identification of Author Style on Electronic Libraries - Case of Lexical Features. 2022 5th International Symposium on Informatics and its Applications (ISIA). :1–4.
In the present work, we intend to present a thorough study developed on a digital library, called HAT corpus, for a purpose of authorship attribution. Thus, a dataset of 300 documents that are written by 100 different authors, was extracted from the web digital library and processed for a task of author style analysis. All the documents are related to the travel topic and written in Arabic. Basically, three important rules in stylometry should be respected: the minimum document size, the same topic for all documents and the same genre too. In this work, we made a particular effort to respect those conditions seriously during the corpus preparation. That is, three lexical features: Fixed-length words, Rare words and Suffixes are used and evaluated by using a centroid based Manhattan distance. The used identification approach shows interesting results with an accuracy of about 0.94.
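A centroid-based Manhattan distance classifier of the kind mentioned in this abstract might look like the minimal sketch below. It assumes each document has already been reduced to a numeric feature vector (for example, relative frequencies of selected word lengths, rare words, or suffixes); the function names and the toy vectors are hypothetical and are not taken from the HAT corpus.

```python
def centroid(vectors):
    """Component-wise mean of an author's training feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def manhattan(u, v):
    """Manhattan (L1) distance between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def nearest_author(centroids, x):
    """Return the author whose centroid is closest to x in Manhattan distance."""
    return min(centroids, key=lambda author: manhattan(centroids[author], x))


# Toy usage with made-up two-dimensional feature vectors.
centroids = {
    "author_A": centroid([[0.1, 0.2], [0.3, 0.4]]),
    "author_B": centroid([[0.9, 0.8], [0.7, 0.6]]),
}
predicted = nearest_author(centroids, [0.25, 0.3])
```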
2022-09-09
Raafat, Maryam A., El-Wakil, Rania Abdel-Fattah, Atia, Ayman.  2021.  Comparative study for Stylometric analysis techniques for authorship attribution. 2021 International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC). :176–181.
A text is a meaningful source of information. Capturing the right patterns in written text gives metrics to measure and infer to what extent this text belongs or is relevant to a specific author. This research aims to introduce a new feature that goes deeper into the language structure. The feature introduced is based on an attempt to differentiate stylistic changes among authors according to the different sentence structure each author uses. The study showed the effect of introducing this new feature to machine learning models to enhance their performance. It was found that the prediction of authors was enhanced by adding sentence structure as an additional feature: the F1 scores increased by 0.3%, and by 5% when the data was also normalized.
Gonçalves, Luís, Vimieiro, Renato.  2021.  Approaching authorship attribution as a multi-view supervised learning task. 2021 International Joint Conference on Neural Networks (IJCNN). :1–8.
Authorship attribution is the problem of identifying the author of texts based on the author's writing style. It is usually assumed that the writing style contains traits inaccessible to conscious manipulation and can thus be safely used to identify the author of a text. Several style markers have been proposed in the literature, nevertheless, there is still no consensus on which best represent the choices of authors. Here we assume an agnostic viewpoint on the dispute for the best set of features that represents an author's writing style. We rather investigate how different sources of information may unveil different aspects of an author's style, complementing each other to improve the overall process of authorship attribution. For this we model authorship attribution as a multi-view learning task. We assess the effectiveness of our proposal applying it to a set of well-studied corpora. We compare the performance of our proposal to the state-of-the-art approaches for authorship attribution. We thoroughly analyze how the multi-view approach improves on methods that use a single data source. We confirm that our approach improves both in accuracy and consistency of the methods and discuss how these improvements are beneficial for linguists and domain specialists.
Saini, Anu, Sri, Manepalli Ratna, Thakur, Mansi.  2021.  Intrinsic Plagiarism Detection System Using Stylometric Features and DBSCAN. 2021 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS). :13–18.
Plagiarism is the act of using someone else’s words or ideas without giving them due credit and representing it as one’s own work. In today's world, it is very easy to plagiarize others' work due to advancement in technology, especially by the use of the Internet or other offline sources such as books or magazines. Plagiarism can be classified into two broad categories on the basis of detection namely extrinsic and intrinsic plagiarism. Extrinsic plagiarism detection refers to detecting plagiarism in a document by comparing it against a given reference dataset, whereas, Intrinsic plagiarism detection refers to detecting plagiarism with the help of variation in writing styles without using any reference corpus. Although there are many approaches which can be adopted to detect extrinsic plagiarism, few are available for intrinsic plagiarism detection. In this paper, a simplified approach is proposed for developing an intrinsic plagiarism detector which is helpful in detecting plagiarism even when no reference corpus is available. The approach deals with development of an intrinsic plagiarism detection system by identifying the writing style of authors in the document using stylometric features and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) clustering. The proposed system has an easy to use interactive interface where user has to upload a text document to be checked for plagiarism and the result is displayed on the web page itself. In addition, the user can also see the analysis of the document in the form of graphs.
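To illustrate how DBSCAN can separate writing styles without a reference corpus, here is a minimal pure-Python sketch of the clustering step. It assumes each paragraph has already been turned into a small stylometric feature vector (for instance, average word length and average sentence length); the feature choice, the function names, and the parameter values are hypothetical and do not reproduce the system described in the abstract.

```python
from math import sqrt


def euclidean(p, q):
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN. Returns one label per point: 0, 1, ... for clusters, -1 for noise.

    A point's neighborhood includes the point itself, as in the usual definition.
    """
    labels = [None] * len(points)  # None means "not yet visited"

    def neighbors(i):
        return [j for j in range(len(points))
                if euclidean(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nb = neighbors(i)
        if len(nb) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbors(j)
            if len(nb_j) >= min_pts:
                queue.extend(nb_j)  # j is a core point, so keep expanding
    return labels
```

Under this sketch, paragraphs labeled -1, or those falling in a small minority cluster, would be candidates for passages written in a different style from the rest of the document.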
Cardaioli, Matteo, Conti, Mauro, Sorbo, Andrea Di, Fabrizio, Enrico, Laudanna, Sonia, Visaggio, Corrado A.  2021.  It’s a Matter of Style: Detecting Social Bots through Writing Style Consistency. 2021 International Conference on Computer Communications and Networks (ICCCN). :1–9.
Social bots are computer algorithms able to produce content and interact with other users on social media autonomously, trying to emulate and possibly influence humans’ behavior. Indeed, bots are largely employed for malicious purposes, like spreading disinformation and conditioning electoral campaigns. Nowadays, bots’ capability of emulating human behaviors has become increasingly sophisticated, making their detection harder. In this paper, we aim at recognizing bot-driven accounts by evaluating the consistency of users’ writing style over time. In particular, we leverage the intuition that while bots compose posts according to fairly deterministic processes, humans are influenced by subjective factors (e.g., emotions) that can alter their writing style. To verify this assumption, by using stylistic consistency indicators, we characterize the writing style of more than 12,000 among bot-driven and human-operated Twitter accounts and find that statistically significant differences can be observed between the different types of users. Thus, we evaluate the effectiveness of different machine learning (ML) algorithms based on stylistic consistency features in discerning between human-operated and bot-driven Twitter accounts and show that the experimented ML algorithms can achieve high performance (i.e., F-measure values up to 98%) in social bot detection tasks.
White, Riley, Sprague, Nathan.  2021.  Deep Metric Learning for Code Authorship Attribution and Verification. 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA). :1089–1093.
Code authorship identification can assist in identifying creators of malware, identifying plagiarism, and giving insights in copyright infringement cases. Taking inspiration from facial recognition work, we apply recent advances in metric learning to the problem of authorship identification and verification. The metric learning approach makes it possible to measure similarity in the learned embedding space. Access to a discriminative similarity measure allows for the estimation of probability distributions that facilitate open-set classification and verification. We extend our analysis to verification based on sets of files, a previously unexplored problem domain in large-scale author identification. On closed-set tasks we achieve competitive accuracies, but do not improve on the state of the art.
Khan, Aazar Imran, Jain, Samyak, Sharma, Purushottam, Deep, Vikas, Mehrotra, Deepti.  2021.  Stylometric Analysis of Writing Patterns Using Artificial Neural Networks. 2021 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT). :29–35.
Plagiarism checkers have been widely used to verify the authenticity of dissertation/project submissions. However, when non-verbatim plagiarism or online examinations are considered, this practice is not the best solution. In this work, we propose a better authentication system for online examinations that analyses the submitted text's stylometry for a match of writing pattern of the author by whom the text was submitted. The writing pattern is analyzed over many indicators (i.e., features of one's writing style). This model extracts 27 such features and stores them as the writing pattern of an individual. Stylometric Analysis is a better approach to verify a document's authorship as it doesn't check for plagiarism, but verifies if the document was written by a particular individual and hence completely shuts down the possibility of using text-convertors or translators. This paper also includes a brief comparative analysis of some simpler algorithms for the same problem statement. These algorithms yield results that vary in precision and accuracy and hence plotting a conclusion from the comparison shows that the best bet to tackle this problem is through Artificial Neural Networks.
Muldoon, Connagh, Ikram, Ahsan, Khan Mirza, Qublai Ali.  2021.  Modern Stylometry: A Review & Experimentation with Machine Learning. 2021 8th International Conference on Future Internet of Things and Cloud (FiCloud). :293–298.
The problem of authorship attribution has applications from literary studies (such as the great Shakespeare/Marlowe debates) to counter-intelligence. The field of stylometry aims to offer quantitative results for authorship attribution. In this paper, we present a combination of stylometric techniques using machine learning. An implementation of the system is used to analyse chat logs and attempts to construct a stylometric model for users within the presented chat system. This allows for the authorship attribution of other works they may write under different names or within different communication systems. This implementation demonstrates accuracy of up to 84 % across the dataset, a full 34 % increase against a random-choice control baseline.
Frankel, Sophia F., Ghosh, Krishnendu.  2021.  Machine Learning Approaches for Authorship Attribution using Source Code Stylometry. 2021 IEEE International Conference on Big Data (Big Data). :3298–3304.
Identification of source code authorship is vital for attribution. In this work, a machine learning framework is described to identify source code authorship. The framework integrates the features extracted using natural language processing based approaches and abstract syntax tree of the code. We evaluate the methodology on Google Code Jam dataset. We present the performance measures of the logistic regression and deep learning on the dataset.
Teodorescu, Horia-Nicolai.  2021.  Applying Chemical Linguistics and Stylometry for Deriving an Author’s Scientific Profile. 2021 International Symposium on Signals, Circuits and Systems (ISSCS). :1–4.
The study exercises computational linguistics, specifically chemical linguistics methods for profiling an author. We analyze the vocabulary and the style of the titles of the most visible works of Cristofor I. Simionescu, an internationally well-known chemist, for detecting specific patterns of his research interests and methods. Somewhat surprisingly, while the tools used are elementary and there is only a small number of words in the analysis, some interesting details emerged about the work of the analyzed personality. Some of these aspects were confirmed by experts in the field. We believe this is the first study aimed at author profiling in chemical linguistics, and moreover the first to question the usefulness of Google Scholar for author profiling.
2020-08-28
Jafariakinabad, Fereshteh, Hua, Kien A.  2019.  Style-Aware Neural Model with Application in Authorship Attribution. 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA). :325–328.

Writing style is a combination of consistent decisions associated with a specific author at different levels of language production, including lexical, syntactic, and structural. In this paper, we introduce a style-aware neural model to encode document information from three stylistic levels and evaluate it in the domain of authorship attribution. First, we propose a simple way to jointly encode syntactic and lexical representations of sentences. Subsequently, we employ an attention-based hierarchical neural network to encode the syntactic and semantic structure of sentences in documents while rewarding the sentences which contribute more to capturing the writing style. Our experimental results, based on four benchmark datasets, reveal the benefits of encoding document information from all three stylistic levels when compared to the baseline methods in the literature.

2020-01-27
Zhang, Yiming, Fan, Yujie, Song, Wei, Hou, Shifu, Ye, Yanfang, Li, Xin, Zhao, Liang, Shi, Chuan, Wang, Jiabin, Xiong, Qi.  2019.  Your Style Your Identity: Leveraging Writing and Photography Styles for Drug Trafficker Identification in Darknet Markets over Attributed Heterogeneous Information Network. The World Wide Web Conference. :3448–3454.
Due to its anonymity, there has been a dramatic growth of underground drug markets hosted in the darknet (e.g., Dream Market and Valhalla). To combat drug trafficking (a.k.a. illicit drug trading) in the cyberspace, there is an urgent need for automatic analysis of participants in darknet markets. However, one of the key challenges is that drug traffickers (i.e., vendors) may maintain multiple accounts across different markets or within the same market. To address this issue, in this paper, we propose and develop an intelligent system named uStyle-uID leveraging both writing and photography styles for drug trafficker identification at the first attempt. At the core of uStyle-uID is an attributed heterogeneous information network (AHIN) which elegantly integrates both writing and photography styles along with the text and photo contents, as well as other supporting attributes (i.e., trafficker and drug information) and various kinds of relations. Built on the constructed AHIN, to efficiently measure the relatedness over nodes (i.e., traffickers) in the constructed AHIN, we propose a new network embedding model Vendor2Vec to learn the low-dimensional representations for the nodes in AHIN, which leverages complementary attribute information attached in the nodes to guide the meta-path based random walk for path instances sampling. After that, we devise a learning model named vIdentifier to classify if a given pair of traffickers are the same individual. Comprehensive experiments on the data collections from four different darknet markets are conducted to validate the effectiveness of uStyle-uID which integrates our proposed method in drug trafficker identification by comparisons with alternative approaches.
Yao, Yuanshun, Li, Huiying, Zheng, Haitao, Zhao, Ben Y.  2019.  Latent Backdoor Attacks on Deep Neural Networks. Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. :2041–2055.

Recent work proposed the concept of backdoor attacks on deep neural networks (DNNs), where misclassification rules are hidden inside normal models, only to be triggered by very specific inputs. However, these "traditional" backdoors assume a context where users train their own models from scratch, which rarely occurs in practice. Instead, users typically customize "Teacher" models already pretrained by providers like Google, through a process called transfer learning. This customization process introduces significant changes to models and disrupts hidden backdoors, greatly reducing the actual impact of backdoors in practice. In this paper, we describe latent backdoors, a more powerful and stealthy variant of backdoor attacks that functions under transfer learning. Latent backdoors are incomplete backdoors embedded into a "Teacher" model, and automatically inherited by multiple "Student" models through transfer learning. If any Student models include the label targeted by the backdoor, then its customization process completes the backdoor and makes it active. We show that latent backdoors can be quite effective in a variety of application contexts, and validate its practicality through real-world attacks against traffic sign recognition, iris identification of volunteers, and facial recognition of public figures (politicians). Finally, we evaluate 4 potential defenses, and find that only one is effective in disrupting latent backdoors, but might incur a cost in classification accuracy as tradeoff.

Teodorescu, Horia-Nicolai, Bolea, Speranta Cecilia.  2019.  Text Sectioning Based on Stylometric Distances. 2019 International Conference on Speech Technology and Human-Computer Dialogue (SpeD). :1–6.
This article continues the stylometric study started in a previous one; we focus on stylometric distances between text segments and the use of these distances in text sectioning based on maximizing the distances between text parts. We refine the method previously introduced and improve on the results. Applications include the automation of stylistic analysis of texts, with implication on text summarization, historical analysis, and authorship analysis.
Tang, Xuemei, Liang, Shichen, Liu, Zhiying.  2019.  Authorship Attribution of The Golden Lotus Based on Text Classification Methods. Proceedings of the 2019 3rd International Conference on Innovation in Artificial Intelligence. :69–72.

In this paper, we explore the authorship attribution of The Golden Lotus using the traditional machine learning method of text classification. There are four candidate authors: Shizhen Wang, Wei Xu, Kaixian Li and Zhideng Wang. We choose the poems in The Golden Lotus and the poems of the four candidate authors as the data set. According to the characteristics of ancient Chinese poems, we choose Chinese characters, rhyme, genre and overlapped words as features. We use six supervised machine learning algorithms (Logistic Regression, Random Forests, Decision Tree, Naive Bayes, SVM and KNN classifiers) for binary and multi-class text classification. According to the results of the two experiments, the writing style of Wei Xu's poems is the most similar to that of The Golden Lotus. It is proved that among the four authors, Wei Xu is most likely the author of The Golden Lotus.

Pascucci, Antonio, Masucci, Vincenzo, Monti, Johanna.  2019.  Computational Stylometry and Machine Learning for Gender and Age Detection in Cyberbullying Texts. 2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW). :1–6.

The aim of this paper is to show the importance of Computational Stylometry (CS) and Machine Learning (ML) support in author's gender and age detection in cyberbullying texts. We developed a cyberbullying detection platform and we show the results of performances in terms of Precision, Recall and F-Measure for gender and age detection in cyberbullying texts we collected.

Cesar, Pablo, Zwitser, Robert, Webb, Andrew, Ashby, Liam, Ali, Abdallah.  2019.  Uncovering Perceived Identification Accuracy of In-Vehicle Biometric Sensing. AutomotiveUI '19: Proceedings of the 11th International Conference on Automotive User Interfaces and Interactive Vehicular Applications: Adjunct Proceedings.

Biometric techniques can help make vehicles safer to drive, authenticate users, and provide personalized in-car experiences. However, it is unclear to what extent users are willing to trade their personal biometric data for such benefits. In this early work, we conducted an open card sorting study (N=11) to better understand how well users perceive their physical, behavioral and physiological features can personally identify them. Findings showed that on average participants clustered features into six groups, and helped us revise ambiguous cards and better understand users' clustering. These findings provide the basis for a follow up online closed card sorting study to more fully understand perceived identification accuracy of (in-vehicle) biometric sensing. By uncovering this at a larger scale, we can then further study the privacy and user experience trade-off in (automated) vehicles.

Matyukhina, Alina, Stakhanova, Natalia, Dalla Preda, Mila, Perley, Celine.  2019.  Adversarial Authorship Attribution in Open-Source Projects. Proceedings of the Ninth ACM Conference on Data and Application Security and Privacy. :291–302.

Open-source software is open to anyone by design, whether it is a community of developers, hackers or malicious users. Authors of open-source software typically hide their identity through nicknames and avatars. However, they have no protection against authorship attribution techniques that are able to create software author profiles just by analyzing software characteristics. In this paper we present an author imitation attack that makes it possible to deceive current authorship attribution systems and mimic the coding style of a target developer. Within this context we explore the potential of the existing attribution techniques to be deceived. Our results show that we are able to imitate the coding style of the developers based on the data collected from the popular source code repository, GitHub. To subvert the author imitation attack, we propose a novel author obfuscation approach that allows us to hide the coding style of the author. Unlike existing obfuscation tools, this new obfuscation technique uses transformations that preserve code readability. We assess the effectiveness of our attacks on several datasets produced by actual developers from GitHub, and participants of the GoogleCodeJam competition. Throughout our experiments we show that the author hiding can be achieved by making sensible transformations which significantly reduce the likelihood of identifying the author's style to 0% by current authorship attribution systems.

Gröndahl, Tommi, Asokan, N.  2019.  Text Analysis in Adversarial Settings: Does Deception Leave a Stylistic Trace? ACM Computing Surveys (CSUR). 52:45:1-45:36.

Textual deception constitutes a major problem for online security. Many studies have argued that deceptiveness leaves traces in writing style, which could be detected using text classification techniques. By conducting an extensive literature review of existing empirical work, we demonstrate that while certain linguistic features have been indicative of deception in certain corpora, they fail to generalize across divergent semantic domains. We suggest that deceptiveness as such leaves no content-invariant stylistic trace, and textual similarity measures provide a superior means of classifying texts as potentially deceptive. Additionally, we discuss forms of deception beyond semantic content, focusing on hiding author identity by writing style obfuscation. Surveying the literature on both author identification and obfuscation techniques, we conclude that current style transformation methods fail to achieve reliable obfuscation while simultaneously ensuring semantic faithfulness to the original text. We propose that future work in style transformation should pay particular attention to disallowing semantically drastic changes.

Farag, Nadine, El-Seoud, Samir Abou, McKee, Gerard, Hassan, Ghada.  2019.  Bullying Hurts: A Survey on Non-Supervised Techniques for Cyber-Bullying Detection. Proceedings of the 2019 8th International Conference on Software and Information Engineering. :85–90.
The contemporary period is scarred by the predominant place of social media in everyday life. Despite social media being a useful tool for communication and social gathering, it also offers opportunities for harmful criminal activities. One of these activities is cyber-bullying, enabled through the abuse and mistreatment of the internet as a means of bullying others virtually. As a way of minimising this occurrence, the scientific research community carries out computer-based research to detect cyber-bullying. An extensive literature search shows that supervised learning techniques are the most commonly used methods for cyber-bullying detection. However, some non-supervised techniques and other approaches have proven to be effective towards cyber-bullying detection. This paper, therefore, surveys recent research on non-supervised techniques and offers some suggestions for future research in textual-based cyber-bullying detection including detecting roles, detecting emotional state, automated annotation and stylometric methods.
Altamimi, Abdulaziz, Clarke, Nathan, Furnell, Steven, Li, Fudong.  2019.  Multi-Platform Authorship Verification. Proceedings of the Third Central European Cybersecurity Conference. :1–7.
Recently, there has been a rapid increase in the variety and popularity of messaging systems such as social network messaging, text messages, email and Twitter, with users frequently exchanging messages across various platforms. Unfortunately, in amongst the legitimate messages, there is a host of illegitimate and inappropriate content, with cyber stalking, trolling and computer-assisted crime all taking place. Therefore, there is a need to identify individuals using messaging systems. Stylometry is the study of linguistic features in a text; in authorship verification, it is used to check whether or not a target text was written by a specific individual, based on that author's writing style. Whilst much research has taken place within authorship verification, studies have focused upon singular platforms, often had limited datasets and restricted methodologies that have meant it is difficult to appreciate the real-world value of the approach. This paper seeks to overcome these limitations through providing an analysis of authorship verification across four common messaging systems. This approach enables a direct comparison of recognition performance and provides a basis for analyzing the feature vectors across platforms to better understand what aspects each capitalize upon in order to achieve good classification. The experiments also include an investigation into the feature vector creation, utilizing population and user-based techniques to compare and contrast performance. The experiment involved 50 participants across four common platforms with a total of 13,617; 106,359; 4,539; and 6,540 samples for Twitter, SMS, Facebook, and Email, achieving an Equal Error Rate (EER) of 20.16%, 7.97%, 25% and 13.11% respectively.
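For readers unfamiliar with the Equal Error Rate reported in this abstract, the sketch below shows one common way to estimate it from verification scores: sweep a decision threshold and take the point where the false acceptance rate (FAR) and false rejection rate (FRR) are closest. The function names and the toy scores are hypothetical; the paper's own evaluation procedure is not described here.

```python
def far_frr(genuine, impostor, threshold):
    """FAR: share of impostor scores accepted; FRR: share of genuine scores rejected.

    A sample is accepted when its score is >= threshold.
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER as the mean of FAR and FRR at the threshold where they are closest."""
    best_gap, best_eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Toy usage with made-up similarity scores (higher means "more likely the claimed author").
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(genuine_scores, impostor_scores)
```

A lower EER indicates better verification performance, which is why the 7.97% reported for SMS is the strongest of the four platform results.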
2019-02-22
Gaston, J., Narayanan, M., Dozier, G., Cothran, D. L., Arms-Chavez, C., Rossi, M., King, M. C., Xu, J.  2018.  Authorship Attribution vs. Adversarial Authorship from a LIWC and Sentiment Analysis Perspective. 2018 IEEE Symposium Series on Computational Intelligence (SSCI). :920–927.

Although Stylometry has been effectively used for Authorship Attribution, there is a growing number of methods being developed that allow authors to mask their identity [2, 13]. In this paper, we investigate the usage of non-traditional feature sets for Authorship Attribution. By using non-traditional feature sets, one may be able to reveal the identity of adversarial authors who are attempting to evade detection from Authorship Attribution systems that are based on more traditional feature sets. In addition, we demonstrate how GEFeS (Genetic & Evolutionary Feature Selection) can be used to evolve high-performance hybrid feature sets composed of two non-traditional feature sets for Authorship Attribution: LIWC (Linguistic Inquiry & Word Count) and Sentiment Analysis. These hybrids were able to reduce the Adversarial Effectiveness on a test set presented in [2] by approximately 33.4%.

Neal, T., Sundararajan, K., Woodard, D.  2018.  Exploiting Linguistic Style as a Cognitive Biometric for Continuous Verification. 2018 International Conference on Biometrics (ICB). :270–276.

This paper presents an assessment of continuous verification using linguistic style as a cognitive biometric. In stylometry, it is widely known that linguistic style is highly characteristic of authorship using representations that capture authorial style at character, lexical, syntactic, and semantic levels. In this work, we provide a contrast to previous efforts by implementing a one-class classification problem using Isolation Forests. Our approach demonstrates the usefulness of this classifier for accurately verifying the genuine user, and yields recognition accuracy exceeding 98% using very small training samples of 50 and 100-character blocks.