Human beings are diverse, and their online behavior is often unpredictable. Providers of online services now collect detailed and comprehensive server-side traces of user activity. These traces include timestamped logs of the actions users take, often called clickstreams. Given their scale and level of detail, clickstreams present an enormous opportunity for research into user behavior analysis and modeling. Understanding, modeling, and predicting user behavior can dramatically improve the security of today's online systems while also advancing basic knowledge of how people use them. This project develops a general platform for user behavioral modeling using clickstreams, with the goal of providing tools that can model user behavior in any application context. If successful, the approach will yield a generalized platform for identifying similar types of user behavior. Prior work using a similar approach has already produced significant results in automatically detecting fake accounts and identities in online social networks.
The PIs will explore the use of clickstream similarity graphs, which are graphs designed to capture and model the similarity (or difference) between the behavior logs of different users. Applying existing graph analysis and semi-supervised learning techniques to these similarity graphs can identify general user behavioral patterns and flag abnormal or previously unknown behavior patterns. The researchers will use real, detailed clickstreams from two online social networks (Renren and Whisper). The goal of the project is to make clickstream similarity graphs a general and practical user modeling tool.
The project will address three key challenges. First, it will explore and address challenges of scale in both the number of users and trace length, so that the techniques can be applied to user populations of hundreds of millions. Second, the project will quantify the level of dynamics in user behavior over time, developing techniques to incrementally modify or update user behavior models. Finally, the PIs will study issues in application specificity, i.e., how the tool can be tuned for different dimensions of user behavior.
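As a rough illustration of what a clickstream similarity graph might look like, the sketch below links users whose event sequences share many consecutive event pairs. It is not the PIs' method: the bigram features, the Jaccard similarity measure, the 0.3 threshold, the connected-components grouping, and all user and event names are assumptions chosen only to keep the example small.

```python
from itertools import combinations


def ngrams(events, n=2):
    """Return the set of runs of n consecutive events in a clickstream."""
    return {tuple(events[i:i + n]) for i in range(len(events) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_graph(clickstreams, n=2, threshold=0.3):
    """Build an undirected weighted graph: user -> {neighbor: similarity}.

    An edge is kept only when the similarity reaches the (assumed) threshold.
    """
    grams = {user: ngrams(events, n) for user, events in clickstreams.items()}
    graph = {user: {} for user in clickstreams}
    for u, v in combinations(clickstreams, 2):
        weight = jaccard(grams[u], grams[v])
        if weight >= threshold:
            graph[u][v] = weight
            graph[v][u] = weight
    return graph


def connected_components(graph):
    """Group users reachable from one another through similarity edges."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node])
        groups.append(group)
    return groups


# Hypothetical clickstreams; event names are illustrative only.
clickstreams = {
    "alice": ["login", "browse", "like", "browse", "logout"],
    "bob": ["login", "browse", "like", "logout"],
    "mallory": ["login", "invite", "invite", "invite", "invite"],
}

graph = similarity_graph(clickstreams)
groups = connected_components(graph)
print(groups)
```

In this toy input, the two users with similar browsing sequences end up in one group while the user who only sends invitations is isolated. The all-pairs comparison grows quadratically with the number of users, which is one way to see why scale is listed as a challenge.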
Continuation of Award #: 1527939