Visible to the public Biblio

Filters: Keyword is area under curve  [Clear All Filters]
2017-03-08
Vizer, L. M., Sears, A..  2015.  Classifying Text-Based Computer Interactions for Health Monitoring. IEEE Pervasive Computing. 14:64–71.

Detecting early trends indicating cognitive decline can allow older adults to better manage their health, but current assessments present barriers precluding the use of such continuous monitoring by consumers. To explore the effects of cognitive status on computer interaction patterns, the authors collected typed text samples from older adults with and without pre-mild cognitive impairment (PreMCI) and constructed statistical models from keystroke and linguistic features for differentiating between the two groups. Using both feature sets, they obtained a 77.1 percent correct classification rate with 70.6 percent sensitivity, 83.3 percent specificity, and a 0.808 area under curve (AUC). These results are in line with current assessments for MC–a more advanced disease–but using an unobtrusive method. This research contributes a combination of features for text and keystroke analysis and enhances understanding of how clinicians or older adults themselves might monitor for PreMCI through patterns in typed text. It has implications for embedded systems that can enable healthcare providers and consumers to proactively and continuously monitor changes in cognitive function.

2015-05-04
Pratanwanich, N., Lio, P..  2014.  Who Wrote This? Textual Modeling with Authorship Attribution in Big Data Data Mining Workshop (ICDMW), 2014 IEEE International Conference on. :645-652.

By representing large corpora with concise and meaningful elements, topic-based generative models aim to reduce the dimension and understand the content of documents. Those techniques originally analyze on words in the documents, but their extensions currently accommodate meta-data such as authorship information, which has been proved useful for textual modeling. The importance of learning authorship is to extract author interests and assign authors to anonymous texts. Author-Topic (AT) model, an unsupervised learning technique, successfully exploits authorship information to model both documents and author interests using topic representations. However, the AT model simplifies that each author has equal contribution on multiple-author documents. To overcome this limitation, we assumes that authors give different degrees of contributions on a document by using a Dirichlet distribution. This automatically transforms the unsupervised AT model to Supervised Author-Topic (SAT) model, which brings a novelty of authorship prediction on anonymous texts. The SAT model outperforms the AT model for identifying authors of documents written by either single authors or multiple authors with a better Receiver Operating Characteristic (ROC) curve and a significantly higher Area Under Curve (AUC). The SAT model not only achieves competitive performance to state-of-the-art techniques e.g. Random forests but also maintains the characteristics of the unsupervised models for information discovery i.e. Word distributions of topics, author interests, and author contributions.