Biblio

Filters: Author is Bauer, Lujo
2023-01-30
Lin, Weiran, Lucas, Keane, Bauer, Lujo, Reiter, Michael K., Sharif, Mahmood.  2022.  Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks. Proceedings of the 39th International Conference on Machine Learning.

We propose new, more efficient targeted whitebox attacks against deep neural networks. Our attacks better align with the attacker’s goal: (1) tricking a model to assign higher probability to the target class than to any other class, while (2) staying within an ε-distance of the attacked input. First, we demonstrate a loss function that explicitly encodes (1) and show that Auto-PGD finds more attacks with it. Second, we propose a new attack method, Constrained Gradient Descent (CGD), using a refinement of our loss function that captures both (1) and (2). CGD seeks to satisfy both attacker objectives (misclassification and bounded ℓp-norm) in a principled manner, as part of the optimization, instead of via ad hoc postprocessing techniques (e.g., projection or clipping). We show that CGD is more successful on CIFAR10 (0.9–4.2%) and ImageNet (8.6–13.6%) than state-of-the-art attacks while consuming less time (11.4–18.8%). Statistical tests confirm that our attack outperforms others against leading defenses on different datasets and values of ε.
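
The two attacker goals named in the abstract above can be illustrated with a minimal Python sketch. It checks (1) whether the target class has the highest logit (which, under softmax, is equivalent to it having the highest probability) and (2) whether the perturbation stays within an L∞ distance ε. The function names and the simple margin formulation are assumptions made for illustration; this is not the loss function or the CGD method from the paper.

```python
def target_margin(logits, target):
    """Largest non-target logit minus the target logit.

    A negative value means the target class outranks every other class.
    (Hypothetical helper, not the paper's loss.)
    """
    best_other = max(v for i, v in enumerate(logits) if i != target)
    return best_other - logits[target]


def linf_distance(x, x_adv):
    """L-infinity distance between two equal-length inputs."""
    return max(abs(a - b) for a, b in zip(x, x_adv))


def attack_succeeds(logits, target, x, x_adv, eps):
    """True if both attacker goals hold: goal (1) and goal (2)."""
    misclassified = target_margin(logits, target) < 0
    within_budget = linf_distance(x, x_adv) <= eps
    return misclassified and within_budget
```

Here `logits` stands for the model's output on the perturbed input `x_adv`; a real attack would obtain it from the network under attack.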

2022-01-12
Lin, Weiran, Lucas, Keane, Bauer, Lujo, Reiter, Michael K., Sharif, Mahmood.  2021.  Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks.
Minimal adversarial perturbations added to inputs have been shown to be effective at fooling deep neural networks. In this paper, we introduce several innovations that make white-box targeted attacks follow the intuition of the attacker's goal: to trick the model to assign a higher probability to the target class than to any other, while staying within a specified distance from the original input. First, we propose a new loss function that explicitly captures the goal of targeted attacks, in particular, by using the logits of all classes instead of just a subset, as is common. We show that Auto-PGD with this loss function finds more adversarial examples than it does with other commonly used loss functions. Second, we propose a new attack method that uses a further developed version of our loss function capturing both the misclassification objective and the L∞ distance limit ϵ. This new attack method is relatively 1.5–4.2% more successful on the CIFAR10 dataset and relatively 8.2–14.9% more successful on the ImageNet dataset than the next best state-of-the-art attack. We confirm using statistical tests that our attack outperforms state-of-the-art attacks on different datasets and values of ϵ and against different defenses.
Lucas, Keane, Sharif, Mahmood, Bauer, Lujo, Reiter, Michael K., Shintre, Saurabh.  2021.  Malware Makeover: Breaking ML-based Static Analysis by Modifying Executable Bytes. ASIA CCS '21: Proceedings of the 2021 ACM Asia Conference on Computer and Communications Security.
Motivated by the transformative impact of deep neural networks (DNNs) in various domains, researchers and anti-virus vendors have proposed DNNs for malware detection from raw bytes that do not require manual feature engineering. In this work, we propose an attack that interweaves binary-diversification techniques and optimization frameworks to mislead such DNNs while preserving the functionality of binaries. Unlike prior attacks, ours manipulates instructions that are a functional part of the binary, which makes it particularly challenging to defend against. We evaluated our attack against three DNNs in white- and black-box settings, and found that it often achieved success rates near 100%. Moreover, we found that our attack can fool some commercial anti-viruses, in certain cases with a success rate of 85%. We explored several defenses, both new and old, and identified some that can foil over 80% of our evasion attempts. However, these defenses may still be susceptible to evasion by attacks, and so we advocate for augmenting malware-detection systems with methods that do not rely on machine learning.
2020-07-13
Bhagavatula, Sruti, Bauer, Lujo, Kapadia, Apu.  2020.  (How) Do people change their passwords after a breach? Workshop on Technology and Consumer Protection (ConPro 2020).

To protect against misuse of passwords compromised in a breach, consumers should promptly change affected passwords and any similar passwords on other accounts. Ideally, affected companies should strongly encourage this behavior and have mechanisms in place to mitigate harm. In order to make recommendations to companies about how to help their users perform these and other security-enhancing actions after breaches, we must first have some understanding of the current effectiveness of companies’ post-breach practices. To study the effectiveness of password-related breach notifications and practices enforced after a breach, we examine, based on real-world password data from 249 participants, whether and how constructively participants changed their passwords after a breach announcement. Of the 249 participants, 63 had accounts on breached domains; only 33% of the 63 changed their passwords and only 13% (of 63) did so within three months of the announcement. New passwords were on average 1.3× stronger than old passwords (when comparing log10-transformed strength), though most were weaker or of equal strength. Concerningly, new passwords were overall more similar to participants’ other passwords, and participants rarely changed passwords on other sites even when these were the same or similar to their password on the breached domain. Our results highlight the need for more rigorous password-changing requirements following a breach and more effective breach notifications that deliver comprehensive advice.

2021-03-09
Sharif, Mahmood, Bauer, Lujo, Reiter, Michael K.  2019.  n-ML: Mitigating adversarial examples via ensembles of topologically manipulated classifiers.

This paper proposes a new defense called $n$-ML against adversarial examples, i.e., inputs crafted by perturbing benign inputs by small amounts to induce misclassifications by classifiers. Inspired by $n$-version programming, $n$-ML trains an ensemble of $n$ classifiers, and inputs are classified by a vote of the classifiers in the ensemble. Unlike prior such approaches, however, the classifiers in the ensemble are trained specifically to classify adversarial examples differently, rendering it very difficult for an adversarial example to obtain enough votes to be misclassified. We show that $n$-ML roughly retains the benign classification accuracies of state-of-the-art models on the MNIST, CIFAR10, and GTSRB datasets, while simultaneously defending against adversarial examples with better resilience than the best defenses known to date and, in most cases, with lower classification-time overhead.
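
The voting step described in the abstract above can be sketched in a few lines of Python. This is only a rough illustration of classifying by a vote of an ensemble: the function name `ensemble_vote`, the `min_votes` threshold, and the choice to return `None` when no label gathers enough votes are assumptions, not details taken from the paper.

```python
from collections import Counter


def ensemble_vote(predictions, min_votes):
    """Return the most-voted label if it has at least min_votes votes.

    predictions: one predicted label per classifier in the ensemble.
    Returns None when no label reaches the threshold (assumed behavior).
    """
    label, votes = Counter(predictions).most_common(1)[0]
    return label if votes >= min_votes else None
```

If the classifiers disagree on an adversarial input, as the defense intends, no single wrong label is likely to collect `min_votes` votes.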

Sharif, Mahmood, Lucas, Keane, Bauer, Lujo, Reiter, Michael K., Shintre, Saurabh.  2019.  Optimization-guided binary diversification to mislead neural networks for malware detection.

Motivated by the transformative impact of deep neural networks (DNNs) on different areas (e.g., image and speech recognition), researchers and anti-virus vendors are proposing end-to-end DNNs for malware detection from raw bytes that do not require manual feature engineering. Given the security sensitivity of the task that these DNNs aim to solve, it is important to assess their susceptibility to evasion.
In this work, we propose an attack that guides binary-diversification tools via optimization to mislead DNNs for malware detection while preserving the functionality of binaries. Unlike previous attacks on such DNNs, ours manipulates instructions that are a functional part of the binary, which makes it particularly challenging to defend against. We evaluated our attack against three DNNs in white-box and black-box settings, and found that it can often achieve success rates near 100%. Moreover, we found that our attack can fool some commercial anti-viruses, in certain cases with a success rate of 85%. We explored several defenses, both new and old, and identified some that can successfully prevent over 80% of our evasion attempts. However, these defenses may still be susceptible to evasion by adaptive attackers, and so we advocate for augmenting malware-detection systems with methods that do not rely on machine learning.

2019-02-08
Colnago, Jessica, Devlin, Summer, Oates, Maggie, Swoopes, Chelse, Bauer, Lujo, Cranor, Lorrie, Christin, Nicolas.  2018.  "It's Not Actually That Horrible": Exploring Adoption of Two-Factor Authentication at a University. Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. :456:1-456:11.

Despite the additional protection it affords, two-factor authentication (2FA) adoption reportedly remains low. To better understand 2FA adoption and its barriers, we observed the deployment of a 2FA system at Carnegie Mellon University (CMU). We explore user behaviors and opinions around adoption, surrounding a mandatory adoption deadline. Our results show that (a) 2FA adopters found it annoying, but fairly easy to use, and believed it made their accounts more secure; (b) experience with CMU Duo often led to positive perceptions, sometimes translating into 2FA adoption for other accounts; and, (c) the differences between users required to adopt 2FA and those who adopted voluntarily are smaller than expected. We also explore the relationship between different usage patterns and perceived usability, and identify user misconceptions, insecure practices, and design issues. We conclude with recommendations for large-scale 2FA deployments to maximize adoption, focusing on implementation design, use of adoption mandates, and strategic messaging.

2018-07-03
Sharif, Mahmood, Bauer, Lujo, Reiter, Michael K.  2018.  On the Suitability of Lp-norms for Creating and Preventing Adversarial Examples. 2018 IEEE Conference.

Much research has been devoted to better understanding adversarial examples, which are specially crafted inputs to machine-learning models that are perceptually similar to benign inputs, but are classified differently (i.e., misclassified). Both algorithms that create adversarial examples and strategies for defending against adversarial examples typically use Lp-norms to measure the perceptual similarity between an adversarial input and its benign original. Prior work has already shown, however, that two images need not be close to each other as measured by an Lp-norm to be perceptually similar. In this work, we show that nearness according to an Lp-norm is not just unnecessary for perceptual similarity, but is also insufficient. Specifically, focusing on datasets (CIFAR10 and MNIST), Lp-norms, and thresholds used in prior work, we show through online user studies that “adversarial examples” that are closer to their benign counterparts than required by commonly used Lp-norm thresholds can nevertheless be perceptually distinct to humans from the corresponding benign examples. Namely, the perceptual distance between two images that are “near” each other according to an Lp-norm can be high enough that participants frequently classify the two images as representing different objects or digits. Combined with prior work, we thus demonstrate that nearness of inputs as measured by Lp-norms is neither necessary nor sufficient for perceptual similarity, which has implications for both creating and defending against adversarial examples. We propose and discuss alternative similarity metrics to stimulate future research in the area.
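
For readers unfamiliar with the distance measure discussed in the abstract above, the following minimal Python sketch computes an Lp distance between two flattened inputs. The function name `lp_distance` and the handling of p = 0 (counting changed coordinates, as is common in the adversarial-examples literature) are assumptions for illustration and are not taken from the paper.

```python
def lp_distance(x, y, p):
    """Lp distance between two equal-length sequences of numbers.

    p may be a positive number, float("inf") for the maximum absolute
    difference, or 0 for the count of coordinates that differ.
    """
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == float("inf"):
        return max(diffs)
    if p == 0:
        return sum(1 for d in diffs if d != 0)
    return sum(d ** p for d in diffs) ** (1.0 / p)
```

An attack or defense would typically compare this value to a fixed threshold; the paper's point is that passing such a threshold does not guarantee that two images look alike to people.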

2023-01-30
Sharif, Mahmood, Bauer, Lujo, Reiter, Michael K.  2018.  On the suitability of Lp-norms for creating and preventing adversarial examples. In Proceedings of The Bright and Dark Sides of Computer Vision: Challenges and Opportunities for Privacy and Security.

Much research effort has been devoted to better understanding adversarial examples, which are specially crafted inputs to machine-learning models that are perceptually similar to benign inputs, but are classified differently (i.e., misclassified). Both algorithms that create adversarial examples and strategies for defending against them typically use Lp-norms to measure the perceptual similarity between an adversarial input and its benign original. Prior work has already shown, however, that two images need not be close to each other as measured by an Lp-norm to be perceptually similar. In this work, we show that nearness according to an Lp-norm is not just unnecessary for perceptual similarity, but is also insufficient. Specifically, focusing on datasets (CIFAR10 and MNIST), Lp-norms, and thresholds used in prior work, we show through online user studies that "adversarial examples" that are closer to their benign counterparts than required by commonly used Lp-norm thresholds can nevertheless be perceptually different to humans from the corresponding benign examples. Namely, the perceptual distance between two images that are "near" each other according to an Lp-norm can be high enough that participants frequently classify the two images as representing different objects or digits. Combined with prior work, we thus demonstrate that nearness of inputs as measured by Lp-norms is neither necessary nor sufficient for perceptual similarity, which has implications for both creating and defending against adversarial examples. We propose and discuss alternative similarity metrics to stimulate future research in the area.

2018-05-09
Ur, Blase, Alfieri, Felicia, Aung, Maung, Bauer, Lujo, Christin, Nicolas, Colnago, Jessica, Cranor, Lorrie Faith, Dixon, Henry, Emami Naeini, Pardis, Habib, Hana et al.  2017.  Design and Evaluation of a Data-Driven Password Meter. Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. :3775–3786.
Despite their ubiquity, many password meters provide inaccurate strength estimates. Furthermore, they do not explain to users what is wrong with their password or how to improve it. We describe the development and evaluation of a data-driven password meter that provides accurate strength measurement and actionable, detailed feedback to users. This meter combines neural networks and numerous carefully combined heuristics to score passwords and generate data-driven text feedback about the user's password. We describe the meter's iterative development and final design. We detail the security and usability impact of the meter's design dimensions, examined through a 4,509-participant online study. Under the more common password-composition policy we tested, we found that the data-driven meter with detailed feedback led users to create more secure, and no less memorable, passwords than a meter with only a bar as a strength indicator.
2017-09-19
Sharif, Mahmood, Bhagavatula, Sruti, Bauer, Lujo, Reiter, Michael K.  2016.  Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. :1528–1540.

Machine learning is enabling a myriad innovations, including new algorithms for cancer diagnosis and self-driving cars. The broad use of machine learning makes it important to understand the extent to which machine-learning algorithms are subject to attack, particularly when used in applications where physical security or safety is at risk. In this paper, we focus on facial biometric systems, which are widely used in surveillance and access control. We define and investigate a novel class of attacks: attacks that are physically realizable and inconspicuous, and allow an attacker to evade recognition or impersonate another individual. We develop a systematic method to automatically generate such attacks, which are realized through printing a pair of eyeglass frames. When worn by the attacker whose image is supplied to a state-of-the-art face-recognition algorithm, the eyeglasses allow her to evade being recognized or to impersonate another individual. Our investigation focuses on white-box face-recognition systems, but we also demonstrate how similar techniques can be used in black-box scenarios, as well as to avoid face detection.

2014-09-17
Mazurek, Michelle L., Komanduri, Saranga, Vidas, Timothy, Bauer, Lujo, Christin, Nicolas, Cranor, Lorrie Faith, Kelley, Patrick Gage, Shay, Richard, Ur, Blase.  2013.  Measuring Password Guessability for an Entire University. Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security. :173–186.
Despite considerable research on passwords, empirical studies of password strength have been limited by lack of access to plaintext passwords, small data sets, and password sets specifically collected for a research study or from low-value accounts. Properties of passwords used for high-value accounts thus remain poorly understood. We fill this gap by studying the single-sign-on passwords used by over 25,000 faculty, staff, and students at a research university with a complex password policy. Key aspects of our contributions rest on our (indirect) access to plaintext passwords. We describe our data collection methodology, particularly the many precautions we took to minimize risks to users. We then analyze how guessable the collected passwords would be during an offline attack by subjecting them to a state-of-the-art password cracking algorithm. We discover significant correlations between a number of demographic and behavioral factors and password strength. For example, we find that users associated with the computer science school make passwords more than 1.5 times as strong as those of users associated with the business school. In addition, we find that stronger passwords are correlated with a higher rate of errors entering them. We also compare the guessability and other characteristics of the passwords we analyzed to sets previously collected in controlled experiments or leaked from low-value accounts. We find more consistent similarities between the university passwords and passwords collected for research studies under similar composition policies than we do between the university passwords and subsets of passwords leaked from low-value accounts that happen to comply with the same policies.
2015-01-12
Ur, Blase, Kelley, Patrick Gage, Komanduri, Saranga, Lee, Joel, Maass, Michael, Mazurek, Michelle, Passaro, Timothy, Shay, Richard, Vidas, Timothy, Bauer, Lujo et al.  2012.  How Does Your Password Measure Up? The Effect of Strength Meters on Password Creation. Security'12: Proceedings of the 21st USENIX Conference on Security Symposium.

To help users create stronger text-based passwords, many web sites have deployed password meters that provide visual feedback on password strength. Although these meters are in wide use, their effects on the security and usability of passwords have not been well studied.

We present a 2,931-subject study of password creation in the presence of 14 password meters. We found that meters with a variety of visual appearances led users to create longer passwords. However, significant increases in resistance to a password-cracking algorithm were only achieved using meters that scored passwords stringently. These stringent meters also led participants to include more digits, symbols, and uppercase letters.

Password meters also affected the act of password creation. Participants who saw stringent meters spent longer creating their password and were more likely to change their password while entering it, yet they were also more likely to find the password meter annoying. However, the most stringent meter and those without visual bars caused participants to place less importance on satisfying the meter. Participants who saw more lenient meters tried to fill the meter and were averse to choosing passwords a meter deemed "bad" or "poor." Our findings can serve as guidelines for administrators seeking to nudge users towards stronger passwords.