Biblio

Filters: Author is Hua, Yu
2022-03-22
Huang, Jianming, Hua, Yu.  2021.  A Write-Friendly and Fast-Recovery Scheme for Security Metadata in Non-Volatile Memories. 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA). :359–370.
Non-Volatile Memories (NVMs) require security mechanisms, e.g., counter mode encryption and integrity tree verification, which protect systems in terms of encryption and data integrity. These security mechanisms rely heavily on extra security metadata that need to be efficiently and accurately recovered after system crashes or power loss. The established SGX integrity tree (SIT) protects system integrity efficiently, but it cannot be restored from its leaves, since computing an SIT node requires its parent node as input. To recover the security metadata with low write overhead and short recovery time, we propose an efficient and instantaneous persistence scheme, called STAR, which instantly persists the modifications of security metadata without extra memory writes. STAR is motivated by our observation that parent nodes in the cache are modified as a consequence of persisting their child nodes. STAR therefore stores the modifications of parent nodes in their child nodes and persists them with a single atomic memory write. To eliminate the overhead of persisting these modifications, STAR coalesces the modifications and MACs in the evicted metadata. For fast recovery and verification of the metadata, STAR uses bitmap lines in the asynchronous DRAM refresh (ADR) domain to indicate the locations of stale metadata, and constructs a cached Merkle tree to verify the correctness of the recovery process. Our evaluation results show that, compared with state-of-the-art work, STAR delivers high performance, low write traffic, low energy consumption, and short recovery time.
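To make the write path concrete, the following minimal Python sketch models the idea the abstract describes: the modification a child's eviction induces in its parent is stored inside the child itself and persisted, together with the MAC, in a single atomic write, while an ADR-backed bitmap marks stale lines for recovery. The node layout, the MAC, and the dictionary standing in for NVM are illustrative assumptions, not the paper's actual hardware structures.

```python
from dataclasses import dataclass

@dataclass
class MetadataNode:
    node_id: int
    counters: list         # node payload, e.g., per-child counters of an SIT node
    parent_delta: int = 0  # pending modification to the parent, carried here
    mac: int = 0           # MAC coalesced into the same (evicted) line

def compute_mac(node: MetadataNode) -> int:
    # Stand-in for a keyed MAC over the node contents (not a real MAC).
    return hash((node.node_id, tuple(node.counters), node.parent_delta)) & 0xFFFFFFFF

def evict(child: MetadataNode, parent: MetadataNode, nvm: dict, stale_bitmap: set):
    """Persist an evicted child with one atomic line write.

    Persisting the child modifies its parent (the parent's entry for this
    child advances), so the parent's change is recorded inside the child
    instead of issuing a second memory write for the parent itself.
    """
    parent.counters[child.node_id % len(parent.counters)] += 1  # in-cache only
    child.parent_delta += 1           # parent's modification stored in the child
    child.mac = compute_mac(child)    # modification and MAC share the line
    nvm[child.node_id] = child        # the single atomic memory write
    stale_bitmap.add(child.node_id)   # ADR-backed bitmap marks stale locations

def recover(nvm: dict, stale_bitmap: set):
    """Replay the parent deltas recorded in the stale lines, verifying MACs."""
    for node_id in sorted(stale_bitmap):
        node = nvm[node_id]
        if node.mac != compute_mac(node):
            raise ValueError(f"node {node_id} failed verification")
        # ...apply node.parent_delta to rebuild the parent node in the tree...
```

The point of the sketch is the single write in evict: the parent is updated only in cache, and its change travels to NVM inside the child's line, which is what lets recovery rebuild parents bottom-up.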
2018-06-11
Sun, Yuanyuan, Hua, Yu, Liu, Xue, Cao, Shunde, Zuo, Pengfei.  2017.  DLSH: A Distribution-aware LSH Scheme for Approximate Nearest Neighbor Query in Cloud Computing. Proceedings of the 2017 Symposium on Cloud Computing. :242–255.
Cloud computing needs to process and analyze massive high-dimensional data in real time. Approximate queries in cloud computing systems can provide timely results with acceptable accuracy, thus avoiding the consumption of a large amount of resources. Locality Sensitive Hashing (LSH) maintains data locality and supports approximate queries. However, because its hash functions are chosen at random, LSH has to use many functions to guarantee query accuracy, and the extra computation and storage overheads degrade its real performance. In order to reduce these overheads and deliver high performance, we propose a distribution-aware scheme, called DLSH, to offer a cost-effective approximate nearest neighbor query service for cloud computing. The idea of DLSH is to leverage the principal components of the data distribution as the projection vectors of the hash functions in LSH, and further to quantify the weight of each hash function and adjust the interval value in each hash table. We then refine the queried result set based on hit frequency to significantly decrease the time overhead of distance computation. Extensive experiments in a large-scale cloud computing testbed demonstrate significant improvements in multiple system performance metrics. We have released the source code of DLSH for public use.
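A small numpy sketch of the distribution-aware idea, under simplifying assumptions: the top principal components serve as projection vectors, singular values provide per-function weights that adjust each table's interval, and the candidate set is refined by hit frequency before exact distances are computed. The function and parameter names (build_dlsh, n_tables, interval) are invented for illustration; the released DLSH source should be consulted for the real implementation.

```python
import numpy as np
from collections import Counter

def build_dlsh(data, n_tables=4, interval=1.0):
    """Index data (n x d) rows: principal components of the data
    distribution serve as the projection vectors of the hash functions."""
    centered = data - data.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    projections = vt[:n_tables]                  # one projection per hash table
    weights = s[:n_tables] / s[:n_tables].sum()  # quantified weight per function
    tables = []
    for p, w in zip(projections, weights):
        width = interval / w                     # interval adjusted by weight
        keys = np.floor(data @ p / width).astype(int)
        buckets = {}
        for i, k in enumerate(keys):
            buckets.setdefault(int(k), []).append(i)
        tables.append((p, width, buckets))
    return tables

def query(tables, data, q, top_k=5):
    """Rank candidates by hit frequency across tables; compute exact
    distances only for the most frequently hit ones (the refinement step)."""
    hits = Counter()
    for p, width, buckets in tables:
        key = int(np.floor(q @ p / width))
        for k in (key - 1, key, key + 1):        # probe neighboring buckets too
            hits.update(buckets.get(k, []))
    candidates = [i for i, _ in hits.most_common(4 * top_k)]
    candidates.sort(key=lambda i: np.linalg.norm(data[i] - q))
    return candidates[:top_k]

# Toy usage: 1000 points in 32 dimensions, query with one of them.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 32))
print(query(build_dlsh(X), X, X[0]))
```

The hit-frequency refinement is what keeps the distance computations cheap: only candidates that collide with the query in several tables are ever compared exactly.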
2018-04-11
Zuo, Pengfei, Hua, Yu, Wang, Cong, Xia, Wen, Cao, Shunde, Zhou, Yukun, Sun, Yuanyuan.  2017.  Mitigating Traffic-Based Side Channel Attacks in Bandwidth-Efficient Cloud Storage. Proceedings of the 2017 Symposium on Cloud Computing. :638–638.
Data deduplication [3] effectively identifies and eliminates redundant data, maintaining only a single copy of files and chunks. Hence, it is widely used in cloud storage systems to save users' network bandwidth when uploading data. However, the occurrence of deduplication can be easily identified by monitoring and analyzing network traffic, which creates a risk of user privacy leakage. An attacker can carry out a very dangerous side channel attack, the learn-the-remaining-information (LRI) attack, to reveal users' private information by exploiting the network-traffic side channel of deduplication [1]. In the LRI attack, the attacker knows a large part of a target file in the cloud and tries to learn the remaining unknown parts by uploading all possible versions of the file's content. For example, suppose the attacker knows all the contents of the target file X except the sensitive information θ. To learn the sensitive information, the attacker uploads m files, one for each possible value of θ. If the file X_d with the value θ_d is deduplicated and the other files are not, the attacker learns that θ = θ_d.

In the threat model of the LRI attack, we consider a general cloud storage service model that includes two entities, the user and the cloud storage server. The attack is launched by users who aim to steal the private information of other users [1]. The attacker can act as a single user via its own account or use multiple accounts to pose as multiple users. The cloud storage server communicates with the users over the Internet. The connections from the clients to the cloud storage server are encrypted by the SSL or TLS protocol. Hence, the attacker can monitor and measure the amount of network traffic between client and server, but cannot intercept and analyze the contents of the transmitted data due to the encryption. The attacker can then perform sophisticated traffic analysis with sufficient computing resources.

We propose a simple yet effective scheme, called the randomized redundant chunk scheme (RRCS), to significantly mitigate the risk of the LRI attack while maintaining the high bandwidth efficiency of deduplication. The basic idea behind RRCS is to add randomized redundant chunks to mix up the real deduplication states of the files used for the LRI attack, which effectively obfuscates the view of an attacker who attempts to exploit the network-traffic side channel. RRCS includes three key function modules: range generation (RG), secure bounds setting (SBS), and security-irrelevant redundancy elimination (SRE). When uploading the redundant chunks, RRCS first uses RG to generate a fixed range [0, λN] (λ ∈ (0,1]) from which the number of added redundant chunks is randomly chosen, where N is the total number of chunks in a file and λ is a system parameter. However, the fixed range may cause a security issue, so SBS adjusts the bounds of the range to avoid it. Since RRCS may also add security-irrelevant redundant chunks, SRE reduces them to improve deduplication efficiency. The design details are presented in our technical report [5]. Our security analysis demonstrates that RRCS can significantly reduce the risk of the LRI attack [5].
We examine the performance of RRCS using three real-world trace-based datasets, Fslhomes [2], MacOS [2], and Onefull [4], and compare RRCS with the randomized threshold scheme (RTS) [1]. Our experimental results show that source-based deduplication eliminates 100% of the data redundancy but offers no security guarantee. File-level (chunk-level) RTS eliminates only 8.1% – 16.8% (9.8% – 20.3%) of the redundancy, because it only eliminates the redundancy of files (chunks) that have many copies. RRCS with λ = 0.5 eliminates 76.1% – 78.0% of the redundancy, and RRCS with λ = 1 eliminates 47.9% – 53.6%.
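As a toy illustration of the RG module described above, the sketch below adds a randomly chosen number of redundant chunk uploads in [0, λN] to a client-side deduplication check, so the traffic an attacker observes no longer reveals the file's exact deduplication state. SBS and SRE are omitted, and the names (upload_with_rrcs, server_index) are illustrative assumptions rather than identifiers from the paper.

```python
import hashlib
import random

def upload_with_rrcs(chunks, server_index, lam=0.5):
    """Upload a file's chunks, re-sending a randomly chosen number of
    already-stored (redundant) chunks so that the observable traffic no
    longer reveals the file's exact deduplication state."""
    n = len(chunks)
    fps = [hashlib.sha256(c).hexdigest() for c in chunks]
    duplicates = [i for i, fp in enumerate(fps) if fp in server_index]
    # RG step: the redundant-chunk count is drawn uniformly from [0, lam*N].
    r = random.randint(0, int(lam * n))
    resend = set(random.sample(duplicates, min(r, len(duplicates))))
    traffic = 0
    for i, fp in enumerate(fps):
        if fp not in server_index or i in resend:
            traffic += 1              # this chunk crosses the network
        server_index.add(fp)          # the server still keeps only one copy
    return traffic                    # what a traffic-monitoring attacker sees

# Toy usage: the second upload of the same file resends 0..lam*N chunks.
server = set()
file_chunks = [bytes([i]) * 64 for i in range(10)]
upload_with_rrcs(file_chunks, server)
print(upload_with_rrcs(file_chunks, server))
```

Because the resend count is random, a fully duplicated file and a partially duplicated one can produce the same traffic volume, which is exactly the ambiguity that frustrates the LRI attacker.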