Visible to the public Biblio

Filters: Keyword is microarchitecture  [Clear All Filters]
2023-05-12
Ponce-de-Leon, Hernán, Kinder, Johannes.  2022.  Cats vs. Spectre: An Axiomatic Approach to Modeling Speculative Execution Attacks. 2022 IEEE Symposium on Security and Privacy (SP). :235–248.

The SPECTRE family of speculative execution attacks has required a rethinking of formal methods for security. Approaches based on operational speculative semantics have made initial inroads towards finding vulnerable code and validating defenses. However, with each new attack grows the amount of microarchitectural detail that has to be integrated into the underlying semantics. We propose an alternative, lightweight and axiomatic approach to specifying speculative semantics that relies on insights from memory models for concurrency. We use the CAT modeling language for memory consistency to specify execution models that capture speculative control flow, store-to-load forwarding, predictive store forwarding, and memory ordering machine clears. We present a bounded model checking framework parameterized by our speculative CAT models and evaluate its implementation against the state of the art. Due to the axiomatic approach, our models can be rapidly extended to allow our framework to detect new types of attacks and validate defenses against them.

ISSN: 2375-1207

2023-03-17
Solanki, Tarun, Panda, Biswabandan.  2022.  SpecPref: High Performing Speculative Attacks Resilient Hardware Prefetchers. 2022 IEEE International Symposium on Hardware Oriented Security and Trust (HOST). :57–60.
With the inception of the Spectre attack in 2018, microarchitecture mitigation strategies propose secure cache hi-erarchies that do not leak the speculative state. Among many mitigation strategies, MuonTrap, proposes an efficient, secure cache hierarchy that provides speculative attack resiliency with minimum performance slowdown. Hardware prefetchers play a significant role in improving application performance by fetching and bringing data and instructions into caches before time. To prevent hardware prefetchers from leaking information about the speculative blocks brought into the cache, MuonTrap trains and triggers hardware prefetchers on the committed instruction streams, eliminating speculative state leakage. We find that on-commit prefetching can lead to significant performance slowdown as high as 20.46 % (primarily because of prefetch timeliness issues), making hardware prefetchers less effective. We propose Speculative yet Secure Prefetching (SpecPref), enhancements on top of the MuonTrap hierarchy that allows prefetching both on-commit and speculatively. We focus on improving the performance slowdown with the state-of-the-art hardware prefetchers without compromising the security guarantee provided by the MuonTrap implementation and provide an average performance slowdown of 1.17%.
2022-12-01
Ajorpaz, Samira Mirbagher, Moghimi, Daniel, Collins, Jeffrey Neal, Pokam, Gilles, Abu-Ghazaleh, Nael, Tullsen, Dean.  2022.  EVAX: Towards a Practical, Pro-active & Adaptive Architecture for High Performance & Security. 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO). :1218—1236.
This paper provides an end-to-end solution to defend against known microarchitectural attacks such as speculative execution attacks, fault-injection attacks, covert and side channel attacks, and unknown or evasive versions of these attacks. Current defenses are attack specific and can have unacceptably high performance overhead. We propose an approach that reduces the overhead of state-of-art defenses by over 95%, by applying defenses only when attacks are detected. Many current proposed mitigations are not practical for deployment; for example, InvisiSpec has 27% overhead and Fencing has 74% overhead while protecting against only Spectre attacks. Other mitigations carry similar performance penalties. We reduce the overhead for InvisiSpec to 1.26% and for Fencing to 3.45% offering performance and security for not only spectre attacks but other known transient attacks as well, including the dangerous class of LVI and Rowhammer attacks, as well as covering a large set of future evasive and zero-day attacks. Critical to our approach is an accurate detector that is not fooled by evasive attacks and that can generalize to novel zero-day attacks. We use a novel Generative framework, Evasion Vaccination (EVAX) for training ML models and engineering new security-centric performance counters. EVAX significantly increases sensitivity to detect and classify attacks in time for mitigation to be deployed with low false positives (4 FPs in every 1M instructions in our experiments). Such performance enables efficient and timely mitigations, enabling the processor to automatically switch between performance and security as needed.
2022-07-12
Lachtar, Nada, Elkhail, Abdulrahman Abu, Bacha, Anys, Malik, Hafiz.  2021.  An Application Agnostic Defense Against the Dark Arts of Cryptojacking. 2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). :314—325.
The popularity of cryptocurrencies has garnered interest from cybercriminals, spurring an onslaught of cryptojacking campaigns that aim to hijack computational resources for the purpose of mining cryptocurrencies. In this paper, we present a cross-stack cryptojacking defense system that spans the hardware and OS layers. Unlike prior work that is confined to detecting cryptojacking behavior within web browsers, our solution is application agnostic. We show that tracking instructions that are frequently used in cryptographic hash functions serve as reliable signatures for fingerprinting cryptojacking activity. We demonstrate that our solution is resilient to multi-threaded and throttling evasion techniques that are commonly employed by cryptojacking malware. We characterize the robustness of our solution by extensively testing a diverse set of workloads that include real consumer applications. Finally, an evaluation of our proof-of-concept implementation shows minimal performance impact while running a mix of benchmark applications.
2022-06-09
Jung, Wonkyung, Lee, Eojin, Kim, Sangpyo, Kim, Namhoon, Lee, Keewoo, Min, Chohong, Cheon, Jung Hee, Ahn, Jung Ho.  2021.  Accelerating Fully Homomorphic Encryption Through Microarchitecture-Aware Analysis and Optimization. 2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). :237–239.
Homomorphic Encryption (HE) [11] draws significant attention as a privacy-preserving way for cloud computing because it allows computation on encrypted messages called ciphertexts. Among numerous FHE schemes [2]–[4], [8], [9], HE for Arithmetic of Approximate Numbers (HEAAN [3]), which is also known as CKKS (Cheon-Kim-Kim-Song), is rapidly gaining popularity [10] as it supports computation on real numbers. A critical shortcoming of HE is the high computational complexity of ciphertext arithmetic, especially, HE multiplication (HE Mul). For example, the execution time for computation on encrypted data (ciphertext) increases from 100s to 10,000s of times compared to that on native, unen-crypted messages. However, a large body of HE acceleration studies, including ones exploiting GPUs and FPGAs, lack a rigorous analysis of computational complexity and data access patterns of HE Mul with large parameter sets on CPUs, the most popular computing platform.
2022-02-04
Biswas, Ananda, Dee, Timothy M., Guo, Yunxi, Li, Zelong, Tyagi, Akhilesh.  2021.  Multi-Granularity Control Flow Anomaly Detection with Hardware Counters. 2021 IEEE 7th World Forum on Internet of Things (WF-IoT). :449—454.
Hardware counters are included in processors to count microarchitecture level events affecting performance. When control flow anomalies caused by attacks such as buffer overflow or return oriented programming (ROP) occur, they leave a microarchitectural footprint. Hardware counters reflect such footprints to flag control flow anomalies. This paper is geared towards buffer overflow and ROP control flow anomaly detection in embedded programs. The targeted program entities are main event loops and task/event handlers. Embedded systems also have enhanced need for variable anomaly detection time in order to meet the system response time requirements. We propose a novel repurposing of Patt-Yeh two level branch predictor data structure for abstracting/hashing HW counter signatures to support such variable anomaly detection times. The proposed anomaly detection mechanism is evaluated on some generic benchmark programs and ArduPilot - a popular autopilot software. Experimental evaluation encompasses both Intel X86 and ARM Cortex M processors. DWT within Cortex M provides sufficiently interesting program level event counts to capture these control flow anomalies. We are able to achieve 97-99%+ accuracy with 1-10 micro-second time overhead per anomaly check.
2021-12-21
He, Zhangying, Miari, Tahereh, Makrani, Hosein Mohammadi, Aliasgari, Mehrdad, Homayoun, Houman, Sayadi, Hossein.  2021.  When Machine Learning Meets Hardware Cybersecurity: Delving into Accurate Zero-Day Malware Detection. 2021 22nd International Symposium on Quality Electronic Design (ISQED). :85–90.
Cybersecurity for the past decades has been in the front line of global attention as a critical threat to the information technology infrastructures. According to recent security reports, malicious software (a.k.a. malware) is rising at an alarming rate in numbers as well as harmful purposes to compromise security of computing systems. To address the high complexity and computational overheads of conventional software-based detection techniques, Hardware-Supported Malware Detection (HMD) has proved to be efficient for detecting malware at the processors' microarchitecture level with the aid of Machine Learning (ML) techniques applied on Hardware Performance Counter (HPC) data. Existing ML-based HMDs while accurate in recognizing known signatures of malicious patterns, have not explored detecting unknown (zero-day) malware data at run-time which is a more challenging problem, since its HPC data does not match any known attack applications' signatures in the existing database. In this work, we first present a review of recent ML-based HMDs utilizing built-in HPC registers information. Next, we examine the suitability of various standard ML classifiers for zero-day malware detection and demonstrate that such methods are not capable of detecting unknown malware signatures with high detection rate. Lastly, to address the challenge of run-time zero-day malware detection, we propose an ensemble learning-based technique to enhance the performance of the standard malware detectors despite using a small number of microarchitectural features that are captured at run-time by existing HPCs. The experimental results demonstrate that our proposed approach by applying AdaBoost ensemble learning on Random Forrest classifier as a regular classifier achieves 92% F-measure and 95% TPR with only 2% false positive rate in detecting zero-day malware using only the top 4 microarchitectural features.
2021-09-16
Biswas, Ananda, Li, Zelong, Tyagi, Akhilesh.  2020.  Control Flow Integrity in IoT Devices with Performance Counters and DWT. 2020 IEEE International Symposium on Smart Electronic Systems (iSES) (Formerly iNiS). :171–176.
IoT devices are open to traditional control flow integrity (CFI) attacks resulting from buffer overflow and return-oriented programming like techniques. They often have limited computational capacity ruling out many of the traditional heavy-duty software countermeasures. In this work, we deploy hardware/software solutions to detect CFI attacks. Some of the medium capability IoT devices, for example based on Raspberry Pi, contain ARM Cortex A-53 (Pi 3) or Cortex A-73 (Pi 4) processors. These processors include hardware counters to count microarchitecture level events affecting performance. Lighter weight IoT devices, say based on ARM Cortex M4 or M7, include DWT (Debug, Watch & Trace) module. When control flow anomalies caused by attacks such as buffer overflow or return oriented programming (ROP) occur, they leave a microarchitectural footprint. Hardware counters reflect such footprints to flag control flow anomalies. This paper is geared towards buffer overflow and ROP control flow anomaly detection in embedded programs. The targeted program entities are main event loops and task/event handlers. The proposed anomaly detection mechanism is evaluated on ArduPilot [1] - a popular autopilot software on a Raspberry Pi 3 with PMU and DWT. A self-navigation program is evaluated on an iCreate Roomba platform with an ARM Cortex M4 processor with DWT only. We are able to achieve 97-99%+ accuracy with 1-10 micro-second time overhead per control flow anomaly check.
Dessouky, Ghada, Frassetto, Tommaso, Jauernig, Patrick, Sadeghi, Ahmad-Reza, Stapf, Emmanuel.  2020.  With Great Complexity Comes Great Vulnerability: From Stand-Alone Fixes to Reconfigurable Security. IEEE Security Privacy. 18:57–66.
The increasing complexity of modern computing devices has rendered security architectures vulnerable to recent side-channel and transient-execution attacks. We discuss the most relevant defenses as well as their drawbacks and how to overcome them for next-generation secure processor design.
Conference Name: IEEE Security Privacy
2021-08-17
Kurth, Michael, Gras, Ben, Andriesse, Dennis, Giuffrida, Cristiano, Bos, Herbert, Razavi, Kaveh.  2020.  NetCAT: Practical Cache Attacks from the Network. 2020 IEEE Symposium on Security and Privacy (SP). :20—38.
Increased peripheral performance is causing strain on the memory subsystem of modern processors. For example, available DRAM throughput can no longer sustain the traffic of a modern network card. Scrambling to deliver the promised performance, instead of transferring peripheral data to and from DRAM, modern Intel processors perform I/O operations directly on the Last Level Cache (LLC). While Direct Cache Access (DCA) instead of Direct Memory Access (DMA) is a sensible performance optimization, it is unfortunately implemented without care for security, as the LLC is now shared between the CPU and all the attached devices, including the network card.In this paper, we reverse engineer the behavior of DCA, widely referred to as Data-Direct I/O (DDIO), on recent Intel processors and present its first security analysis. Based on our analysis, we present NetCAT, the first Network-based PRIME+PROBE Cache Attack on the processor's LLC of a remote machine. We show that NetCAT not only enables attacks in cooperative settings where an attacker can build a covert channel between a network client and a sandboxed server process (without network), but more worryingly, in general adversarial settings. In such settings, NetCAT can enable disclosure of network timing-based sensitive information. As an example, we show a keystroke timing attack on a victim SSH connection belonging to another client on the target server. Our results should caution processor vendors against unsupervised sharing of (additional) microarchitectural components with peripherals exposed to malicious input.
2021-03-17
Huo, T., Wang, W., Zhao, P., Li, Y., Wang, T., Li, M..  2020.  TEADS: A Defense-Aware Framework for Synthesizing Transient Execution Attacks. 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :320—327.

Since 2018, a broad class of microarchitectural attacks called transient execution attacks (e.g., Spectre and Meltdown) have been disclosed. By abusing speculative execution mechanisms in modern CPUs, these attacks enable adversaries to leak secrets across security boundaries. A transient execution attack typically evolves through multiple stages, termed the attack chain. We find that current transient execution attacks usually rely on static attack chains, resulting in that any blockage in an attack chain may cause the failure of the entire attack. In this paper, we propose a novel defense-aware framework, called TEADS, for synthesizing transient execution attacks dynamically. The main idea of TEADS is that: each attacking stage in a transient execution attack chain can be implemented in several ways, and the implementations used in different attacking stages can be combined together under certain constraints. By constructing an attacking graph representing combination relationships between the implementations and testing available paths in the attacking graph dynamically, we can finally synthesize transient execution attacks which can bypass the imposed defense techniques. Our contributions include: (1) proposing an automated defense-aware framework for synthesizing transient execution attacks, even though possible combinations of defense strategies are enabled; (2) presenting an attacking graph extension algorithm to detect potential attack chains dynamically; (3) implementing TEADS and testing it on several modern CPUs with different protection settings. Experimental results show that TEADS can bypass the defenses equipped, improving the adaptability and durability of transient execution attacks.

2021-03-15
Silitonga, A., Gassoumi, H., Becker, J..  2020.  MiteS: Software-based Microarchitectural Attacks and Countermeasures in networked AP SoC Platforms. 2020 IEEE 14th International Conference on Anti-counterfeiting, Security, and Identification (ASID). :65—71.

The impact of microarchitectural attacks in Personal Computers (PCs) can be further adapted to and observed in internetworked All Programmable System-on-Chip (AP SoC) platforms. This effort involves the access control or execution of Intellectual Property cores in the FPGA of an AP SoC Victim internetworked with an AP SoC Attacker via Internet Protocol (IP). Three conceptions of attacks were implemented: buffer overflow attack at the stack, return-oriented programming attack, and command-injection-based attack for dynamic reconfiguration in the FPGA. Indeed, a specific preventive countermeasure for each attack is proposed. The functionality of the countermeasures mainly comprises adapted words addition (stack protection) for the first and second attacks and multiple encryption for the third attack. In conclusion, the recommended countermeasures are realizable to counteract the implemented attacks.

Chowdhuryy, M. H. Islam, Liu, H., Yao, F..  2020.  BranchSpec: Information Leakage Attacks Exploiting Speculative Branch Instruction Executions. 2020 IEEE 38th International Conference on Computer Design (ICCD). :529–536.
Recent studies on attacks exploiting processor hardware vulnerabilities have raised significant concern for information security. Particularly, transient execution attacks such as Spectre augment microarchitectural side channels with speculative executions that lead to exfiltration of secretive data not intended to be accessed. Many prior works have demonstrated the manipulation of branch predictors for triggering speculative executions, and thereafter leaking sensitive information through processor microarchitectural components. In this paper, we present a new class of microarchitectural attack, called BranchSpec, that performs information leakage by exploiting state changes of branch predictors in speculative path. Our key observation is that, branch instruction executions in speculative path alter the states of branch pattern history, which are not restored even after the speculatively executed branches are eventually squashed. Unfortunately, this enables adversaries to harness branch predictors as the transmitting medium in transient execution attacks. More importantly, as compared to existing speculative attacks (e.g., Spectre), BranchSpec can take advantage of much simpler code patterns in victim's code base, making the impact of such exploitation potentially even more severe. To demonstrate this security vulnerability, we have implemented two variants of BranchSpec attacks: a side channel where a malicious spy process infers cross-boundary secrets via victim's speculatively executed nested branches, and a covert channel that communicates secrets through intentionally perturbing the branch pattern history structure via speculative branch executions. Our evaluation on Intel Skylake- and Coffee Lake-based processors reveals that these information leakage attacks are highly accurate and successful. To the best of our knowledge, this is the first work to reveal the information leakage threat due to speculative state update in branch predictor. Our studies further broaden the attack surface of processor microarchitecture, and highlight the needs for branch prediction mechanisms that are secure in transient executions.
2021-03-04
Wang, H., Sayadi, H., Kolhe, G., Sasan, A., Rafatirad, S., Homayoun, H..  2020.  Phased-Guard: Multi-Phase Machine Learning Framework for Detection and Identification of Zero-Day Microarchitectural Side-Channel Attacks. 2020 IEEE 38th International Conference on Computer Design (ICCD). :648—655.

Microarchitectural Side-Channel Attacks (SCAs) have emerged recently to compromise the security of computer systems by exploiting the existing processors' hardware vulnerabilities. In order to detect such attacks, prior studies have proposed the deployment of low-level features captured from built-in Hardware Performance Counter (HPC) registers in modern microprocessors to implement accurate Machine Learning (ML)-based SCAs detectors. Though effective, such attack detection techniques have mainly focused on binary classification models offering limited insights on identifying the type of attacks. In addition, while existing SCAs detectors required prior knowledge of attacks applications to detect the pattern of side-channel attacks using a variety of microarchitectural features, detecting unknown (zero-day) SCAs at run-time using the available HPCs remains a major challenge. In response, in this work we first identify the most important HPC features for SCA detection using an effective feature reduction method. Next, we propose Phased-Guard, a two-level machine learning-based framework to accurately detect and classify both known and unknown attacks at run-time using the most prominent low-level features. In the first level (SCA Detection), Phased-Guard using a binary classification model detects the existence of SCAs on the target system by determining the critical scenarios including system under attack and system under no attack. In the second level (SCA Identification) to further enhance the security against side-channel attacks, Phased-Guard deploys a multiclass classification model to identify the type of SCA applications. The experimental results indicate that Phased-Guard by monitoring only the victim applications' microarchitectural HPCs data, achieves up to 98 % attack detection accuracy and 99.5% SCA identification accuracy significantly outperforming the state-of-the-art solutions by up to 82 % in zero-day attack detection at the cost of only 4% performance overhead for monitoring.

2020-04-03
Cheang, Kevin, Rasmussen, Cameron, Seshia, Sanjit, Subramanyan, Pramod.  2019.  A Formal Approach to Secure Speculation. 2019 IEEE 32nd Computer Security Foundations Symposium (CSF). :288—28815.
Transient execution attacks like Spectre, Meltdown and Foreshadow have shown that combinations of microarchitectural side-channels can be synergistically exploited to create side-channel leaks that are greater than the sum of their parts. While both hardware and software mitigations have been proposed against these attacks, provable security has remained elusive. This paper introduces a formal methodology for enabling secure speculative execution on modern processors. We propose a new class of information flow security properties called trace property-dependent observational determinism (TPOD). We use this class to formulate a secure speculation property. Our formulation precisely characterises all transient execution vulnerabilities. We demonstrate its applicability by verifying secure speculation for several illustrative programs.
2019-10-14
Kocher, P., Horn, J., Fogh, A., Genkin, D., Gruss, D., Haas, W., Hamburg, M., Lipp, M., Mangard, S., Prescher, T. et al..  2019.  Spectre Attacks: Exploiting Speculative Execution. 2019 IEEE Symposium on Security and Privacy (SP). :1–19.

Modern processors use branch prediction and speculative execution to maximize performance. For example, if the destination of a branch depends on a memory value that is in the process of being read, CPUs will try to guess the destination and attempt to execute ahead. When the memory value finally arrives, the CPU either discards or commits the speculative computation. Speculative logic is unfaithful in how it executes, can access the victim's memory and registers, and can perform operations with measurable side effects. Spectre attacks involve inducing a victim to speculatively perform operations that would not occur during correct program execution and which leak the victim's confidential information via a side channel to the adversary. This paper describes practical attacks that combine methodology from side channel attacks, fault attacks, and return-oriented programming that can read arbitrary memory from the victim's process. More broadly, the paper shows that speculative execution implementations violate the security assumptions underpinning numerous software security mechanisms, including operating system process separation, containerization, just-in-time (JIT) compilation, and countermeasures to cache timing and side-channel attacks. These attacks represent a serious threat to actual systems since vulnerable speculative execution capabilities are found in microprocessors from Intel, AMD, and ARM that are used in billions of devices. While makeshift processor-specific countermeasures are possible in some cases, sound solutions will require fixes to processor designs as well as updates to instruction set architectures (ISAs) to give hardware architects and software developers a common understanding as to what computation state CPU implementations are (and are not) permitted to leak.

Tymburibá, M., Sousa, H., Pereira, F..  2019.  Multilayer ROP Protection Via Microarchitectural Units Available in Commodity Hardware. 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). :315–327.

This paper presents a multilayer protection approach to guard programs against Return-Oriented Programming (ROP) attacks. Upper layers validate most of a program's control flow at a low computational cost; thus, not compromising runtime. Lower layers provide strong enforcement guarantees to handle more suspicious flows; thus, enhancing security. Our multilayer system combines techniques already described in the literature with verifications that we introduce in this paper. We argue that modern versions of x86 processors already provide the microarchitectural units necessary to implement our technique. We demonstrate the effectiveness of our multilayer protection on a extensive suite of benchmarks, which includes: SPEC CPU2006; the three most popular web browsers; 209 benchmarks distributed with LLVM and four well-known systems shown to be vulnerable to ROP exploits. Our experiments indicate that we can protect programs with almost no overhead in practice, allying the good performance of lightweight security techniques with the high dependability of heavyweight approaches.

2019-02-13
Van Bulck, Jo, Piessens, Frank, Strackx, Raoul.  2018.  Nemesis: Studying Microarchitectural Timing Leaks in Rudimentary CPU Interrupt Logic. Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. :178–195.
Recent research on transient execution vulnerabilities shows that current processors exceed our levels of understanding. The prominent Meltdown and Spectre attacks abruptly revealed fundamental design flaws in CPU pipeline behavior and exception handling logic, urging the research community to systematically study attack surface from microarchitectural interactions. We present Nemesis, a previously overlooked side-channel attack vector that abuses the CPU's interrupt mechanism to leak microarchitectural instruction timings from enclaved execution environments such as Intel SGX, Sancus, and TrustLite. At its core, Nemesis abuses the same subtle microarchitectural behavior that enables Meltdown, i.e., exceptions and interrupts are delayed until instruction retirement. We show that by measuring the latency of a carefully timed interrupt, an attacker controlling the system software is able to infer instruction-granular execution state from hardware-enforced enclaves. In contrast to speculative execution vulnerabilities, our novel attack vector is applicable to the whole computing spectrum, from small embedded sensor nodes to high-end commodity x86 hardware. We present practical interrupt timing attacks against the open-source Sancus embedded research processor, and we show that interrupt latency reveals microarchitectural instruction timings from off-the-shelf Intel SGX enclaves. Finally, we discuss challenges for mitigating Nemesis-type attacks at the hardware and software levels.