USE: User Security Behavior (CMU/Berkeley/University of Pittsburgh Collaborative Proposal)
Public Audience
Purpose: To highlight progress. Information is generally at a higher level which is accessible to the interested public.
PI(s): A. Acquisti, L.F. Cranor, N. Christin, R. Telang
Researchers: Alain Forget (CMU), Serge Egelman (Berkeley), and Scott Beach (Univ of Pittsburgh)
1) HARD PROBLEM(S) ADDRESSED (with short descriptions)
This refers to Hard Problems, released November 2012.
5. Understanding and Accounting for Human Behavior
The Security Behavior Observatory addresses the hard problem of "Understanding and Accounting for Human Behavior" by collecting data directly from people's own home computers, thereby capturing people's computing behavior "in the wild". This data is the closest to the ground truth of the users' everyday security and privacy challenges that the research community has ever collected. We expect the insights discovered by analyzing this data will profoundly impact multiple research domains, including but not limited to behavioral sciences, computer security & privacy, economics, and human-computer interaction.
2) PUBLICATIONS
Report papers written as a result of this research. If a paper was submitted to or accepted by a journal, name the journal. If it was presented at a conference, name the conference.
A. Forget, S. Komanduri, A. Acquisti, N. Christin, L.F. Cranor, R. Telang. "Security Behavior Observatory: Infrastructure for Long-term Monitoring of Client Machines." Carnegie Mellon University CyLab Technical Report CMU-CyLab-14-009. https://www.cylab.cmu.edu/research/techreports/2014/tr_cylab14009.html (accessed 2014-09-05)
A. Forget, S. Komanduri, A. Acquisti, N. Christin, L.F. Cranor, R. Telang (2014). Building the Security Behavior Observatory: An Infrastructure for Long-term Monitoring of Client Machines. Invited talk and poster at the IEEE Symposium and Bootcamp on the Science of Security (HotSoS) 2014.
By its very nature - building infrastructure to collect data, then collecting and eventually analyzing the data - the project has a long setup phase. As a result, we expect most of its publications to come in the second half of its projected duration. We are nonetheless confident that the larger number of higher-quality sensors we are building, together with the more secure, reliable, and robust infrastructure we continue to develop, will yield more and better data, and in turn more and stronger publications.
In the meantime, we are using early data from the SBO to validate our UC Berkeley collaborator's Security Behavior Intentions Scale, and we hope to submit this analysis for peer review and publication soon. Time permitting, we may also submit a paper devoted solely to the SBO infrastructure, with some early results. Finally, we hope to compile the lessons learned from building and launching such a large-scale field study into another publication.
3) KEY HIGHLIGHTS
1) The successful beta-test pilot study of our data collection architecture demonstrated the reliability and stability of our infrastructure and led to the launch of our main study, for which we are recruiting participants from the general population. We encountered some technical challenges, which have since been resolved, and we will soon push out a new major version of our software with greater remote control functionality.
2) A registry sensor, an improved filesystem sensor, and an improved process sensor will very shortly be ready for deployment. They will run alongside our existing data collection sensors, which track client machines' network packet headers, Windows security logs, Windows updates, installed software, web browsing settings and behavior, and wireless access points.
3) We have begun deploying an instance of the Cloudera Distribution including Apache Hadoop (CDH) for our long-term data storage needs. We chose Cloudera because it is the only Hadoop distribution supporting encryption-at-rest, which we require to protect our clients' very sensitive data. Furthermore, Hadoop-based solutions can handle large amounts of data, are well-known and supported, and can be supplemented with additional data querying functionality (e.g., Hive, Pig) and other distributed data structures (e.g., HBase, Cassandra). It also seems relatively straightforward to export subsets of the data for more specific analyses that may be more easily performed with other solutions (e.g., document databases, graph databases, R).
4) We have begun an early data analysis project examining how well users' self-reported security behavior intentions correlate with the actual behaviors we observe on their monitored machines. We hope to submit the results of this analysis for peer review and publication soon.
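For readers curious what such a comparison can look like in practice, the following is a minimal sketch, not the project's actual analysis. It assumes each participant has one self-reported intention score and one observed behavior measure, and it uses Spearman's rank correlation as one plausible choice of statistic. The function names, the variable names, and the numbers in the usage example are all hypothetical and are not SBO data.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    # Hypothetical values for illustration only (NOT study data):
    # a self-reported intention score and the observed number of days
    # each participant took to install an available update.
    intention_score = [4.5, 3.0, 2.0, 4.0, 1.5]
    days_to_update = [2, 10, 21, 3, 30]
    print(f"Spearman rho = {spearman(intention_score, days_to_update):.2f}")
```

A rank-based statistic is used in this sketch because survey scores are ordinal and behavioral measures are often skewed. A real analysis would also need to report sample sizes and significance, and to account for multiple comparisons.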