Biblio
Despite the additional protection it affords, two-factor authentication (2FA) adoption reportedly remains low. To better understand 2FA adoption and its barriers, we observed the deployment of a 2FA system at Carnegie Mellon University (CMU). We explore user behaviors and opinions around adoption, surrounding a mandatory adoption deadline. Our results show that (a) 2FA adopters found it annoying, but fairly easy to use, and believed it made their accounts more secure; (b) experience with CMU Duo often led to positive perceptions, sometimes translating into 2FA adoption for other accounts; and, (c) the differences between users required to adopt 2FA and those who adopted voluntarily are smaller than expected. We also explore the relationship between different usage patterns and perceived usability, and identify user misconceptions, insecure practices, and design issues. We conclude with recommendations for large-scale 2FA deployments to maximize adoption, focusing on implementation design, use of adoption mandates, and strategic messaging.
Understanding how to group a set of binary files into the piece of software they belong to is highly desirable for software profiling, malware detection, or enterprise audits, among many other applications. Unfortunately, it is also extremely challenging: there is absolutely no uniformity in the ways different applications rely on different files, in how binaries are signed, or in the versioning schemes used across different pieces of software. In this paper, we show that, by combining information gleaned from a large number of endpoints (millions of computers), we can accomplish large-scale application identification automatically and reliably. Our approach relies on collecting metadata on billions of files every day, summarizing it into much smaller "sketches", and performing approximate k-nearest neighbor clustering on non-metric space representations derived from these sketches. We design and implement our proposed system using Apache Spark, show that it can process billions of files in a matter of hours, and thus could be used for daily processing. We further show our system manages to successfully identify which files belong to which application with very high precision, and adequate recall.
We present an architecture for the Security Behavior Observatory (SBO), a client-server infrastructure designed to collect a wide array of data on user and computer behavior from hundreds of participants over several years. The SBO infrastructure had to be carefully designed to fulfill several requirements. First, the SBO must scale with the desired length, breadth, and depth of data collection. Second, we must take extraordinary care to ensure the security of the collected data, which will inevitably include intimate participant behavioral data. Third, the SBO must serve our research interests, which will inevitably change as collected data is analyzed and interpreted. This short paper summarizes some of our design and implementation benefits and discusses a few hurdles and trade-offs to consider when designing such a data collection system.