Biblio
We present attacks that use only the volume of responses to range queries to reconstruct databases. Our focus is on practical attacks that work for large-scale databases with many values and records, without requiring assumptions on the data or query distributions. Our work improves on the previous state-of-the-art due to Kellaris et al. (CCS 2016) in all of these dimensions. Our main attack targets reconstruction of database counts and involves a novel graph-theoretic approach. It generally succeeds when R , the number of records, exceeds \$N2/2\$, where N is the number of possible values in the database. For a uniform query distribution, we show that it requires volume leakage from only O(N2 łog N) queries (cf. O(N4łog N) in prior work). We present two ancillary attacks. The first identifies the value of a new item added to a database using the volume leakage from fresh queries, in the setting where the adversary knows or has previously recovered the database counts. The second shows how to efficiently recover the ranges involved in queries in an online fashion, given an auxiliary distribution describing the database. Our attacks are all backed with mathematical analyses and extensive simulations using real data.
Recently there has been much interest in performing search queries over encrypted data to enable functionality while protecting sensitive data. One particularly efficient mechanism for executing such queries is order-preserving encryption/encoding (OPE) which results in ciphertexts that preserve the relative order of the underlying plaintexts thus allowing range and comparison queries to be performed directly on ciphertexts. Recently, Popa et al. (SP 2013) gave the first construction of an ideally-secure OPE scheme and Kerschbaum (CCS 2015) showed how to achieve the even stronger notion of frequency-hiding OPE. However, as Naveed et al. (CCS 2015) have recently demonstrated, these constructions remain vulnerable to several attacks. Additionally, all previous ideal OPE schemes (with or without frequency-hiding) either require a large round complexity of O(log n) rounds for each insertion, or a large persistent client storage of size O(n), where n is the number of items in the database. It is thus desirable to achieve a range query scheme addressing both issues gracefully. In this paper, we propose an alternative approach to range queries over encrypted data that is optimized to support insert-heavy workloads as are common in "big data" applications while still maintaining search functionality and achieving stronger security. Specifically, we propose a new primitive called partial order preserving encoding (POPE) that achieves ideal OPE security with frequency hiding and also leaves a sizable fraction of the data pairwise incomparable. Using only O(1) persistent and O(ne) non-persistent client storage for 0(1-e)) search queries. This improved security and performance makes our scheme better suited for today's insert-heavy databases.