Biblio
In today's computerized, information-based society, individuals are constantly presented with vast amounts of text data, ranging from news articles, scientific publications, and product reviews to a wide variety of text from social media. To extract value from these large, multi-domain pools of text, it is important to understand the entities they mention and the relationships among them. In this tutorial, we introduce data-driven methods to recognize typed entities of interest in massive, domain-specific text corpora. These methods can automatically identify token spans as entity mentions in documents and label their fine-grained types (e.g., person, product, and food) in a scalable way. Since these methods do not rely on annotated data, predefined typing schemas, or hand-crafted features, they can be quickly adapted to new domains, genres, and languages. We demonstrate on real datasets covering various genres (e.g., news articles, discussion forum posts, and tweets), domains (general vs. biomedical), and languages (e.g., English, Chinese, Arabic, and even low-resource languages like Hausa and Yoruba) how these typed entities aid in knowledge discovery and management.
Typing is a human activity that can be affected by a number of situational and task-specific factors. Changes in typing behavior that result from manipulating such factors can be predictably observed through key-level input analytics. Here we present a study designed to explore these relationships. Participants play a typing game in which letter composition, word length, and the number of words appearing together are varied across levels. Inter-keystroke timings and other higher-order statistics (such as bursts and pauses), as well as typing strategies, are analyzed from game logs to find the set of metrics that best quantifies the effect of the different experimental factors on observable typing behavior. Beyond task-specific factors, we also study the effects of habituation by recording changes in performance with practice. Currently a work in progress, this research aims to develop a predictive model of human typing. We believe this insight can lead to the development of novel security proofs for interactive systems that can be deployed on existing infrastructure with minimal overhead. Possible applications of such predictive capabilities include anomalous behavior detection, authentication using typing signatures, and bot detection using word challenges.
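As a rough illustration of the key-level metrics mentioned in this abstract, the following minimal sketch computes inter-keystroke intervals and splits a keystroke log into bursts separated by pauses. It is not the authors' analysis pipeline: the function names, the input format (a list of key-press timestamps in milliseconds), and the 500 ms pause threshold are all assumptions made for this example.

```python
from statistics import mean


def inter_key_intervals(timestamps_ms):
    """Return the gaps (ms) between consecutive key-press timestamps."""
    return [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]


def split_bursts(timestamps_ms, pause_threshold_ms=500):
    """Group keystrokes into bursts; a gap above the threshold starts a new burst.

    The 500 ms default is an assumed value, not one taken from the study.
    """
    if not timestamps_ms:
        return []
    bursts = [[timestamps_ms[0]]]
    for prev, cur in zip(timestamps_ms, timestamps_ms[1:]):
        if cur - prev > pause_threshold_ms:
            bursts.append([cur])      # pause detected: open a new burst
        else:
            bursts[-1].append(cur)    # still inside the current burst
    return bursts


def summarize(timestamps_ms, pause_threshold_ms=500):
    """Summarize one keystroke log with a few simple, hypothetical metrics."""
    intervals = inter_key_intervals(timestamps_ms)
    bursts = split_bursts(timestamps_ms, pause_threshold_ms)
    pauses = [gap for gap in intervals if gap > pause_threshold_ms]
    return {
        "mean_interval_ms": mean(intervals) if intervals else None,
        "num_bursts": len(bursts),
        "num_pauses": len(pauses),
        "mean_burst_length": mean(len(b) for b in bursts) if bursts else None,
    }


if __name__ == "__main__":
    # Made-up timestamps for illustration only, not data from the study.
    log = [0, 100, 220, 1000, 1090, 1200, 2500]
    print(summarize(log))
```

Comparing such summaries across game levels (for example, short versus long words) would be one simple way to see how an experimental factor shifts the metrics; the threshold would need to be tuned on real logs.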