Biblio
Efficiently searchable and easily deployable encryption schemes enable an untrusted, legacy service such as a relational database engine to perform searches over encrypted data. The ease with which such schemes can be deployed on top of existing services makes them especially appealing in operational environments where encryption is needed but it is not feasible to replace large infrastructure components like databases or document management systems. Unfortunately all previously known approaches for efficiently searchable and easily deployable encryption are vulnerable to inference attacks where an adversary can use knowledge of the distribution of the data to recover the plaintext with high probability. We present a new efficiently searchable, easily deployable database encryption scheme that is provably secure against inference attacks even when used with real, low-entropy data. We implemented our constructions in Haskell and tested databases up to 10 million records showing our construction properly balances security, deployability and performance.
Encrypting Internet communications has been the subject of renewed focus in recent years. In order to add end-to-end encryption to legacy applications without losing the convenience of full-text search, ShadowCrypt and Mimesis Aegis use a new cryptographic technique called "efficiently deployable efficiently searchable encryption" (EDESE) that allows a standard full-text search system to perform searches on encrypted data. Compared to other recent techniques for searching on encrypted data, EDESE schemes leak a great deal of statistical information about the encrypted messages and the keywords they contain. Until now, the practical impact of this leakage has been difficult to quantify. In this paper, we show that the adversary's task of matching plaintext keywords to the opaque cryptographic identifiers used in EDESE can be reduced to the well-known combinatorial optimization problem of weighted graph matching (WGM). Using real email and chat data, we show how off-the-shelf WGM solvers can be used to accurately and efficiently recover hundreds of the most common plaintext keywords from a set of EDESE-encrypted messages. We show how to recover the tags from Bloom filters so that the WGM solver can be used with the set of encrypted messages that utilizes a Bloom filter to encode its search tags. We also show that the attack can be mitigated by carefully configuring Bloom filter parameters.