Biblio
The security field relies on user studies, often including survey questions, to query end users' general security behavior and experiences, or their hypothetical responses to new messages or tools. Self-report data has many benefits (ease of collection, control, and depth of understanding) but also many well-known biases, stemming from people's difficulty remembering prior events or predicting how they might behave, as well as their tendency to shape their answers to a perceived audience. Prior work in fields like public health has focused on measuring these biases and developing effective mitigations; however, there is limited evidence as to whether and how these biases and mitigations apply specifically in a computer-security context. In this work, we systematically compare real-world measurement data to survey results, focusing on an exemplar, well-studied security behavior: software updating. We align field measurements about specific software updates (n=517,932) with survey results in which participants respond to the update messages that were used when those versions were released (n=2,092). This allows us to examine differences between self-reported and observed update speeds, as well as self-reported responses to the particular message features that may correlate with these outcomes. The results indicate that, for the most part, self-reported data varies consistently and systematically with measured data. However, this systematic relationship breaks down when survey respondents are required to notice and act on minor details of experimental manipulations. Our results suggest that many insights from self-report security data can, when used with care, translate to real-world environments; however, insights about specific variations in message texts or other details may be more difficult to assess with surveys.
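To make the alignment step concrete, the following minimal Python sketch shows one plausible way to compare per-version survey responses against per-version field measurements using a rank correlation. The column names, the Likert-style scale, and the choice of Spearman correlation are illustrative assumptions, not the paper's actual analysis code.

# Hypothetical sketch: align self-reported and measured update speeds per version.
import pandas as pd
from scipy.stats import spearmanr

# Field telemetry: one row per (machine, version) update event (synthetic values).
measured = pd.DataFrame({
    "version": ["52.0", "52.0", "53.0", "53.0", "54.0", "54.0", "55.0", "55.0"],
    "days_to_update": [3.0, 10.0, 2.0, 6.0, 8.0, 21.0, 5.0, 12.0],
})

# Survey responses to the same versions' update messages, e.g. on a
# 1 (would update immediately) .. 5 (would never update) scale (assumed).
surveyed = pd.DataFrame({
    "version": ["52.0", "52.0", "53.0", "53.0", "54.0", "54.0", "55.0", "55.0"],
    "reported_speed": [1, 2, 1, 1, 3, 4, 2, 3],
})

# Aggregate each source per version so the two can be aligned.
obs = measured.groupby("version")["days_to_update"].median()
rep = surveyed.groupby("version")["reported_speed"].mean()
aligned = pd.concat([obs, rep], axis=1).dropna()

# A systematic relationship would appear as a positive rank correlation:
# versions people say they would adopt slowly are also patched slowly in the field.
rho, p = spearmanr(aligned["days_to_update"], aligned["reported_speed"])
print(f"Spearman rho={rho:.2f}, p={p:.3f}")

A rank correlation is a natural fit here because survey scales are ordinal: it asks only whether the ordering of versions agrees across the two data sources, not whether the magnitudes match.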
Malware detection increasingly relies on machine learning techniques, which use multiple features to separate malware from benign apps. The effectiveness of these techniques primarily depends on a manual feature engineering process, driven by human knowledge and intuition. However, given the adversaries' efforts to evade detection and the growing volume of publications on malware behaviors, the feature engineering process likely draws on only a fraction of the relevant knowledge. We propose an end-to-end approach for automatic feature engineering. We describe techniques for mining documents written in natural language (e.g., scientific papers) and for representing and querying the knowledge about malware in a way that mirrors the human feature engineering process. Specifically, we first identify abstract behaviors that are associated with malware, and we then map these behaviors to concrete features that can be tested experimentally. We implement these ideas in a system called FeatureSmith, which generates a feature set for detecting Android malware. We train a classifier using these features on a large data set of benign and malicious apps. This classifier achieves a 92.5% true positive rate at only a 1% false positive rate, which is comparable to the performance of a state-of-the-art Android malware detector that relies on manually engineered features. In addition, FeatureSmith can suggest informative features that are absent from the manually engineered set and link the generated features to abstract concepts that describe malware behaviors.
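As a rough illustration of the evaluation described above (not FeatureSmith's actual pipeline), the Python sketch below trains a classifier on stand-in binary feature vectors and reads off the true positive rate at the 1% false positive operating point quoted in the abstract. The synthetic data and the random-forest choice are assumptions.

# Hypothetical sketch: evaluate a malware detector at a fixed 1% FPR.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in data: rows are apps, columns are binary behavior-derived features
# (e.g. "sends SMS to premium number"); names and values are synthetic.
n_apps, n_features = 2000, 50
X = rng.integers(0, 2, size=(n_apps, n_features)).astype(float)
y = rng.integers(0, 2, size=n_apps)          # 1 = malicious, 0 = benign
X[y == 1, :5] += rng.random((y.sum(), 5))    # make a few features informative

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Operate the detector at a 1% false positive rate and report its TPR,
# mirroring the "92.5% TPR at 1% FPR" figure quoted above.
scores = clf.predict_proba(X_te)[:, 1]
fpr, tpr, _ = roc_curve(y_te, scores)
tpr_at_1pct = np.interp(0.01, fpr, tpr)
print(f"TPR at 1% FPR: {tpr_at_1pct:.1%}")

Reporting TPR at a fixed low FPR, rather than overall accuracy, reflects how malware detectors are deployed: false alarms on benign apps are costly, so detectors are compared at the same false-positive budget.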