Visible to the public Biblio

Filters: Author is Zhang, Lixin  [Clear All Filters]
2023-01-13
Zhao, Lutan, Li, Peinan, HOU, RUI, Huang, Michael C., Qian, Xuehai, Zhang, Lixin, Meng, Dan.  2022.  HyBP: Hybrid Isolation-Randomization Secure Branch Predictor. 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA). :346—359.
Recently exposed vulnerabilities reveal the necessity to improve the security of branch predictors. Branch predictors record history about the execution of different processes, and such information from different processes are stored in the same structure and thus accessible to each other. This leaves the attackers with the opportunities for malicious training and malicious perception. Physical or logical isolation mechanisms such as using dedicated tables and flushing during context-switch can provide security but incur non-trivial costs in space and/or execution time. Randomization mechanisms incurs the performance cost in a different way: those with higher securities add latency to the critical path of the pipeline, while the simpler alternatives leave vulnerabilities to more sophisticated attacks.This paper proposes HyBP, a practical hybrid protection and effective mechanism for building secure branch predictors. The design applies the physical isolation and randomization in the right component to achieve the best of both worlds. We propose to protect the smaller tables with physically isolation based on (thread, privilege) combination; and protect the large tables with randomization. Surprisingly, the physical isolation also significantly enhances the security of the last-level tables by naturally filtering out accesses, reducing the information flow to these bigger tables. As a result, key changes can happen less frequently and be performed conveniently at context switches. Moreover, we propose a latency hiding design for a strong cipher by precomputing the "code book" with a validated, cryptographically strong cipher. Overall, our design incurs a performance penalty of 0.5% compared to 5.1% of physical isolation under the default context switching interval in Linux.
2021-05-05
Zhu, Jianping, HOU, RUI, Wang, XiaoFeng, Wang, Wenhao, Cao, Jiangfeng, Zhao, Boyan, Wang, Zhongpu, Zhang, Yuhui, Ying, Jiameng, Zhang, Lixin et al..  2020.  Enabling Rack-scale Confidential Computing using Heterogeneous Trusted Execution Environment. 2020 IEEE Symposium on Security and Privacy (SP). :1450—1465.

With its huge real-world demands, large-scale confidential computing still cannot be supported by today's Trusted Execution Environment (TEE), due to the lack of scalable and effective protection of high-throughput accelerators like GPUs, FPGAs, and TPUs etc. Although attempts have been made recently to extend the CPU-like enclave to GPUs, these solutions require change to the CPU or GPU chips, may introduce new security risks due to the side-channel leaks in CPU-GPU communication and are still under the resource constraint of today's CPU TEE.To address these problems, we present the first Heterogeneous TEE design that can truly support large-scale compute or data intensive (CDI) computing, without any chip-level change. Our approach, called HETEE, is a device for centralized management of all computing units (e.g., GPUs and other accelerators) of a server rack. It is uniquely designed to work with today's data centres and clouds, leveraging modern resource pooling technologies to dynamically compartmentalize computing tasks, and enforce strong isolation and reduce TCB through hardware support. More specifically, HETEE utilizes the PCIe ExpressFabric to allocate its accelerators to the server node on the same rack for a non-sensitive CDI task, and move them back into a secure enclave in response to the demand for confidential computing. Our design runs a thin TCB stack for security management on a security controller (SC), while leaving a large set of software (e.g., AI runtime, GPU driver, etc.) to the integrated microservers that operate enclaves. An enclaves is physically isolated from others through hardware and verified by the SC at its inception. Its microserver and computing units are restored to a secure state upon termination.We implemented HETEE on a real hardware system, and evaluated it with popular neural network inference and training tasks. Our evaluations show that HETEE can easily support the CDI tasks on the real-world scale and incurred a maximal throughput overhead of 2.17% for inference and 0.95% for training on ResNet152.