Human Behavior and Cyber Vulnerabilities - UMD - January 2017

PI(s): VS Subrahmanian
Researchers: Ziyun Zhu, Srijan Kumar, Arunesh Mathur, Noseong Park, Josefine Engel, Brahm Persaud, Sorour Amiri, and Liangzhe Chen (graduate students), and Tudor Dumitras, Marshini Chetty, and Aditya Prakash (faculty)

 

HARD PROBLEM(S) ADDRESSED

Understanding and Accounting for Human Behavior

Security-Metrics-Driven Evaluation, Design, Development, and Deployment

PROJECT SYNOPSIS
When a vulnerability is exploited, software vendors often release patches fixing the vulnerability. However, our prior research has shown that some vulnerabilities continue to be exploited more than four years after their disclosure. Why? We posit that there are both technical and sociological reasons for this. On the technical side, it is unclear how quickly security patches are disseminated, and how long it takes to patch all the vulnerable hosts on the Internet. On the sociological side, users/administrators may decide to delay the deployment of security patches. Our goal in this task is to validate and quantify these explanations. Specifically, we seek to characterize the rate of vulnerability patching, and to determine the factors--both technical and sociological--that influence the rate of applying patches.

PUBLICATIONS

  1. Z. Zhu and T. Dumitras. FeatureSmith: Automatically Engineering Features for Malware Detection by Mining the Security Literature. Accepted at the ACM Conference on Computer and Communications Security (CCS), 2016.

ACCOMPLISHMENT HIGHLIGHTS

The effectiveness of machine-learning techniques used for security tasks such as malware detection depends primarily on manual feature engineering, which relies on human knowledge and intuition. However, given attackers’ efforts to evade detection and the growing volume of security reports and publications, human-driven feature engineering likely draws on only a fraction of the relevant knowledge. We developed methods to engineer such features automatically by mining natural language documents such as research papers, industry reports, and hacker forums. Building on ideas from cognitive psychology, we implemented natural language processing techniques that mirror the human process of reasoning about what malware samples have in common and that address security-specific challenges and opportunities. As a proof of concept, we trained a classifier for detecting Android malware using automatically engineered features, and we achieved performance comparable to that of a state-of-the-art malware detector that uses manually engineered features [1]. In addition, our techniques can suggest informative features that are absent from the manually engineered set, and they can link the generated features to human-understandable concepts that describe malware behaviors.
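To make the idea of mining documents for features more concrete, the following is a minimal sketch and not the method published in [1]. It assumes a much simpler technique: counting how often a candidate feature name appears in the same sentence as a term describing malware behavior, and remembering which behavior terms each feature was seen with. The vocabularies (CANDIDATE_FEATURES, BEHAVIOR_TERMS), the function names, and the sentence splitter are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical vocabularies, chosen only for illustration.
# A real system would have to discover these from the documents themselves.
CANDIDATE_FEATURES = ["sendTextMessage", "getDeviceId", "READ_CONTACTS"]
BEHAVIOR_TERMS = ["premium sms", "steal", "leak"]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def rank_features(documents, features=CANDIDATE_FEATURES, behaviors=BEHAVIOR_TERMS):
    """Score each feature by the number of sentences that mention it
    together with at least one behavior term.

    Returns (ranked, links): ranked is a list of (feature, count) pairs in
    descending count order; links maps each scored feature to the set of
    behavior terms it co-occurred with.
    """
    scores = Counter()
    links = {}
    for doc in documents:
        for sentence in split_sentences(doc):
            lowered = sentence.lower()
            matched = [b for b in behaviors if b in lowered]
            if not matched:
                continue  # sentence describes no known behavior
            for feature in features:
                if feature.lower() in lowered:
                    scores[feature] += 1
                    links.setdefault(feature, set()).update(matched)
    return scores.most_common(), links
```

Calling rank_features on a list of document strings returns the candidate features ordered by co-occurrence count, along with the behavior terms linked to each one. The links dictionary is a crude stand-in for the report's point that generated features can be tied back to human-understandable concepts. Features that never appear alongside a behavior term are left out of the ranking.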

More information is available at http://www.umiacs.umd.edu/~tdumitra/blog/2016/10/16/automatic-feature-engineering-learning-how-to-detect-malware-by-mining-the-scientific-literature/