Biblio
With the massive amounts of data available today, it is common to store and process data using multiple machines. Parallel programming platforms such as MapReduce and its variants are popular frameworks for handling such large data sets. We present the first provably efficient algorithms to compute, store, and query data structures for range queries and approximate nearest neighbor queries in a popular parallel computing abstraction that captures the salient features of MapReduce and other massively parallel communication (MPC) models. In particular, we describe algorithms for kd-trees, range trees, and BBD-trees that require only O(1) rounds of communication for both preprocessing and querying while remaining competitive with their classical counterparts in running time and workload. Our algorithms are randomized, but they can be made deterministic at some increase in running time and workload while keeping the number of communication rounds constant.
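For orientation, the classical sequential kd-tree that the abstract uses as a baseline can be sketched as follows. This is a minimal single-machine illustration, not the O(1)-round MPC algorithm from the paper; the names `Node`, `build`, and `range_query` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point                  # splitting point stored at this node
    axis: int                     # coordinate used to split at this depth
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Build a kd-tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(
        point=pts[mid],
        axis=axis,
        left=build(pts[:mid], depth + 1),        # coordinates <= median on axis
        right=build(pts[mid + 1:], depth + 1),   # coordinates >= median on axis
    )


def range_query(node: Optional[Node], lo: Point, hi: Point,
                out: Optional[List[Point]] = None) -> List[Point]:
    """Report all points p with lo[i] <= p[i] <= hi[i] for every axis i."""
    if out is None:
        out = []
    if node is None:
        return out
    if all(l <= c <= h for l, c, h in zip(lo, node.point, hi)):
        out.append(node.point)
    a = node.axis
    # Descend only into subtrees that can intersect the query box.
    if lo[a] <= node.point[a]:
        range_query(node.left, lo, hi, out)
    if node.point[a] <= hi[a]:
        range_query(node.right, lo, hi, out)
    return out


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(points)
hits = sorted(range_query(tree, (3, 1), (8, 5)))
```

The pruning test on the splitting axis is what lets a query skip whole subtrees; the paper's contribution concerns building and querying such structures across many machines, which this sketch does not attempt.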
Recently there has been much interest in performing search queries over encrypted data to enable functionality while protecting sensitive data. One particularly efficient mechanism for executing such queries is order-preserving encryption/encoding (OPE), which results in ciphertexts that preserve the relative order of the underlying plaintexts, thus allowing range and comparison queries to be performed directly on ciphertexts. Popa et al. (SP 2013) gave the first construction of an ideally-secure OPE scheme, and Kerschbaum (CCS 2015) showed how to achieve the even stronger notion of frequency-hiding OPE. However, as Naveed et al. (CCS 2015) have demonstrated, these constructions remain vulnerable to several attacks. Additionally, all previous ideal OPE schemes (with or without frequency hiding) require either a large round complexity of O(log n) rounds for each insertion or a large persistent client storage of size O(n), where n is the number of items in the database. It is thus desirable to achieve a range query scheme that addresses both issues gracefully. In this paper, we propose an alternative approach to range queries over encrypted data that is optimized to support insert-heavy workloads, as are common in "big data" applications, while still maintaining search functionality and achieving stronger security. Specifically, we propose a new primitive called partial order preserving encoding (POPE) that achieves ideal OPE security with frequency hiding and also leaves a sizable fraction of the data pairwise incomparable. Using only O(1) persistent and O(n^ε) non-persistent client storage for 0 < ε < 1, our POPE scheme provides extremely fast batch insertion consisting of a single round, and efficient search with O(1) amortized cost for up to O(n^(1-ε)) search queries. This improved security and performance make our scheme better suited for today's insert-heavy databases.
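To make the order-preserving idea concrete, the following toy sketch shows how a server can answer a range query by comparing encodings alone. It is an illustrative assumption, not POPE and not any scheme cited above: the helper names (`make_ope_table`, `encode`, `server_range_query`) are hypothetical, the mapping is seeded from Python's non-cryptographic `random` module, and it is deterministic and reveals the full order, which are exactly the weaknesses that frequency-hiding and partial-order schemes aim to reduce.

```python
import bisect
import random
from typing import List


def make_ope_table(key: int, domain_size: int, max_gap: int = 1000) -> List[int]:
    """Toy keyed mapping: a strictly increasing table over the domain 0..domain_size-1.

    NOT secure: random.Random is not a cryptographic PRNG. Illustration only.
    """
    rng = random.Random(key)
    table, current = [], 0
    for _ in range(domain_size):
        current += rng.randint(1, max_gap)   # positive gap keeps the table strictly increasing
        table.append(current)
    return table


def encode(table: List[int], x: int) -> int:
    """Client side: map plaintext x to its order-preserving code."""
    return table[x]


def server_range_query(sorted_codes: List[int], lo_code: int, hi_code: int) -> List[int]:
    """Server side: uses only code comparisons, never the plaintexts."""
    i = bisect.bisect_left(sorted_codes, lo_code)
    j = bisect.bisect_right(sorted_codes, hi_code)
    return sorted_codes[i:j]


table = make_ope_table(key=42, domain_size=100)
plaintexts = [17, 3, 88, 42, 56]

# The server stores only the codes, kept in sorted order.
stored = sorted(encode(table, x) for x in plaintexts)

# The client asks for plaintexts in [10, 60] by sending the two encoded endpoints.
result_codes = server_range_query(stored, encode(table, 10), encode(table, 60))

# The client decodes with its table (a reverse lookup in this toy version).
decoded = [table.index(c) for c in result_codes]
```

Because the table here is a total order known to anyone who sees the codes, every pair of stored items is comparable; the abstract's point is that POPE deliberately leaves many pairs incomparable.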