Securing Safety-Critical Machine Learning Algorithms - January 2023
PI(s), Co-PI(s), Researchers: Lujo Bauer, Matt Fredrikson (CMU), Mike Reiter (UNC)
HARD PROBLEM(S) ADDRESSED
This project addresses the following hard problems: developing security metrics and developing resilient architectures. Both problems are tackled in the context of deep neural networks, a particularly popular and performant type of machine-learning algorithm. The project develops metrics that characterize the degree to which a neural-network-based classifier can be evaded through practically realizable, inconspicuous attacks. The project also develops neural-network architectures that are robust to adversarial examples.
PUBLICATIONS
N/A this quarter
PUBLIC ACCOMPLISHMENT HIGHLIGHTS
ML models have shown promise in classifying raw executable files (binaries) as malicious or benign with high accuracy. As a result, ML-based classification methods have become increasingly influential in academic and real-world malware detection, a critical tool in cybersecurity. However, prior work has urged caution by showing that malicious binaries can be transformed in functionality-preserving ways to produce adversarial examples, i.e., variants that evade detection.
We investigated the effectiveness of adversarial training for creating malware-classification models that are more robust to several state-of-the-art attacks. To train our most robust models, we significantly increased the efficiency and scale at which adversarial examples can be created, making adversarial training practical for raw-binary malware detectors for the first time. We then analyzed the effects of varying the duration of adversarial training and of training with different versions of the attacks. We found that data augmentation alone does not deter state-of-the-art attacks, but that a generic gradient-guided method adapted from other discrete domains does improve robustness. We also showed that, in most cases, models can be made more robust to malware-domain attacks by adversarially training with lower-effort versions of the same attack. In the best case, we reduced one state-of-the-art attack's success rate from 90% to 5%. Finally, we found that training with some attacks can increase robustness to other types of attacks.
Adversarial training for raw-binary malware classifiers.
Keane Lucas, Samruddhi Pai, Weiran Lin, Lujo Bauer, Michael K. Reiter, Mahmood Sharif.
In Proceedings of the 32nd USENIX Security Symposium, August 2023. To appear.
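To make the approach concrete, the following is a minimal sketch of an adversarial-training loop for a raw-binary classifier, written in PyTorch. The classifier (ByteClassifier), the perturbation routine (gradient_guided_perturb), the editable-byte mask, and the toy data are illustrative assumptions, not the models or attacks used in this work; the actual functionality-preserving attacks and training configurations are described in the paper cited above.

# Minimal sketch of an adversarial-training loop for a raw-binary classifier.
# Assumptions (not from the report): a PyTorch model over byte sequences, a
# hypothetical gradient-guided perturbation routine restricted to bytes an
# attacker could edit, and toy in-memory data.
import torch
import torch.nn as nn

VOCAB = 257        # 256 byte values + a padding token
SEQ_LEN = 4096     # truncated/padded binary length (toy value)

class ByteClassifier(nn.Module):
    """Small embedding + convolution classifier over raw bytes (stand-in for MalConv-style models)."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, 8)
        self.conv = nn.Conv1d(8, 16, kernel_size=16, stride=16)
        self.head = nn.Linear(16, 2)

    def forward(self, x):
        z = self.embed(x).transpose(1, 2)          # (batch, channels, length)
        z = torch.relu(self.conv(z)).mean(dim=2)   # global average pooling
        return self.head(z)

def gradient_guided_perturb(model, x, y, editable_mask, steps=4):
    """Hypothetical lower-effort attack: greedily modify bytes at positions the
    attacker may edit (e.g., slack/padding regions), choosing positions by the
    loss gradient. Real functionality-preserving attacks are more involved."""
    x_adv = x.clone()
    for _ in range(steps):
        emb = model.embed(x_adv).detach().requires_grad_(True)
        z = torch.relu(model.conv(emb.transpose(1, 2))).mean(dim=2)
        loss = nn.functional.cross_entropy(model.head(z), y)
        loss.backward()
        # Per sample, pick the editable byte with the largest gradient magnitude
        # and replace it with a random byte value (crude discrete update).
        grad_mag = emb.grad.norm(dim=2) * editable_mask   # (batch, seq)
        pos = grad_mag.argmax(dim=1)
        x_adv[torch.arange(x.size(0)), pos] = torch.randint(0, 256, (x.size(0),))
    return x_adv

def adversarial_training_step(model, opt, x, y, editable_mask):
    """Train on a mix of clean and adversarially perturbed binaries."""
    x_adv = gradient_guided_perturb(model, x, y, editable_mask)
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y) + \
           nn.functional.cross_entropy(model(x_adv), y)
    loss.backward()
    opt.step()
    return loss.item()

if __name__ == "__main__":
    model = ByteClassifier()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x = torch.randint(0, 256, (8, SEQ_LEN))     # toy "binaries"
    y = torch.randint(0, 2, (8,))               # benign/malicious labels
    mask = torch.zeros(8, SEQ_LEN)
    mask[:, -256:] = 1                          # pretend only trailing bytes are editable
    print(adversarial_training_step(model, opt, x, y, mask))

In this sketch, "lower-effort" corresponds to using few perturbation steps during training; the finding reported above is that such cheaper attack versions can still confer robustness against stronger versions of the same attack.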
COMMUNITY ENGAGEMENTS (If applicable)
No new data
EDUCATIONAL ADVANCES (If applicable)
N/A this quarter