Reasoning about Accidental and Malicious Misuse via Formal Methods

PI(s), Co-PI(s), Researchers:

PI: Munindar Singh; Co-PIs: William Enck, Laurie Williams; Researchers: Hui Guo, Samin Yaseer Mahmud, Md Rayhanur Rahman, Vaibhav Garg

HARD PROBLEM(S) ADDRESSED
This refers to Hard Problems, released November 2012.

  • Policy

This project seeks to aid security analysts in identifying and protecting against accidental and malicious misuse by users or software. It applies automated reasoning over unified representations of user expectations and software implementations to identify misuses that are sensitive to usage and machine context.

PUBLICATIONS

Munindar P. Singh. "Consent as a Foundation for Responsible Autonomy." Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI), February 2022. Blue Sky Track.

Md Rayhanur Rahman, Nasif Imtiaz, Margaret-Anne Storey, and Laurie Williams. "Why Secret Detection Tools Are Not Enough: It's Not Just About False Positives - An Industrial Case Study." Empirical Software Engineering, to appear [will upload when paper is online].

KEY HIGHLIGHTS


  • We identified key challenges and directions for a proper model of consent by which AI agents can act responsibly and assist users.

  • We enhanced our framework, iRogue, which identifies rogue apps, and compared it with several keyword-based baselines on app reviews and descriptions. iRogue produces the best recall (90.47%) and F1 (78.08%) among all approaches.

  • We have started a project on identifying various kinds of misbehavior on mobile apps, such as soliciting inappropriate messages or threatening other users. We train a classifier on a dataset of app reviews to identify incidents of online misbehavior. In our experiments, a Universal Sentence Encoder with a LinearSVC classifier gives the best performance: 83% recall at 92% precision.
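    The pipeline above can be sketched as follows. This is a minimal illustration, not the project's actual code: it substitutes TF-IDF features for Universal Sentence Encoder embeddings (which require a pretrained TensorFlow Hub model), and the review snippets and labels are invented for demonstration.

    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # Hypothetical app-review snippets; 1 = reports misbehavior, 0 = ordinary review.
    reviews = [
        "this user keeps sending me threatening messages",
        "someone asked me for inappropriate photos",
        "great app, smooth interface and fast loading",
        "love the new update, works perfectly",
    ]
    labels = [1, 1, 0, 0]

    # In the actual study, the text encoder would be a Universal Sentence
    # Encoder; TF-IDF stands in here so the sketch is self-contained.
    clf = make_pipeline(TfidfVectorizer(), LinearSVC())
    clf.fit(reviews, labels)

    # Classify an unseen review.
    print(clf.predict(["he threatened me in chat"])[0])
    ```

    Swapping the featurizer for sentence embeddings requires only replacing the first pipeline stage; the LinearSVC stage and the fit/predict interface stay the same.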

  • We use comparative sentences in app reviews to identify user expectations and preferences between two compared products. To this end, we created a manually annotated gold-standard dataset containing ~9k comparative sentences, each annotated for the type of comparison (implicit or explicit) and the entity preferred. We experimented with traditional and deep learning-based models and achieved results better than the existing best-performing model for the task.

  • We built a dataflow-based static program analysis tool to study how Payment Service Provider (PSP) libraries for mobile Android apps store security-critical information.

  • We conducted a comparison study of three Natural Language Processing (NLP)/Machine Learning (ML) models for extracting attacker techniques from cyber threat intelligence (CTI) reports.

  • We studied the impact of credential/secret detection tools on developer behavior in a large software company.

COMMUNITY ENGAGEMENTS

None.

EDUCATIONAL ADVANCES:

We engaged a female undergraduate student on this project.