Biblio
At the first Information Hiding Workshop in 1996 we tried to clarify the models and assumptions behind information hiding. We agreed the terminology of cover text and stego text against a background of the game proposed by our keynote speaker Gus Simmons: that Alice and Bob are in jail and wish to hatch an escape plan without the fact of their communication coming to the attention of the warden, Willie. Since then there have been significant strides in developing technical mechanisms for steganography and steganalysis, with new techniques from machine learning providing ever more powerful tools for the analyst, such as the ensemble classifier. There have also been a number of conceptual advances, such as the square root law and effective key length. But there always remains the question whether we are using the right security metrics for the application. In this talk I plan to take a step backwards and look at the systems context. When can stegosystems actually be used? The deployment history is patchy, with one being Trucrypt's hidden volumes, inspired by the steganographic file system. Image forensics also find some use, and may be helpful against some adversarial machine learning attacks (or at least help us understand them). But there are other contexts in which patterns of activity have to be hidden for that activity to be effective. I will discuss a number of examples starting with deception mechanisms such as honeypots, Tor bridges and pluggable transports, which merely have to evade detection for a while; then moving on to the more challenging task of designing deniability mechanisms, from leaking secrets to a newspaper through bitcoin mixes, which have to withstand forensic examination once the participants come under suspicion. We already know that, at the system level, anonymity is hard. However the increasing quantity and richness of the data available to opponents may move a number of applications from the deception category to that of deniability. To pick up on our model of 20 years ago, Willie might not just put Alice and Bob in solitary confinement if he finds them communicating, but torture them or even execute them. Changing threat models are historically one of the great disruptive forces in security engineering. This leads me to suspect that a useful research area may be the intersection of deception and forensics, and how information hiding systems can be designed in anticipation of richer and more complex threat models. The ever-more-aggressive censorship systems deployed in some parts of the world also raise the possibility of using information hiding techniques in censorship circumvention. As an example of recent practical work, I will discuss Covertmark, a toolkit for testing pluggable transports that was partly inspired by Stirmark, a tool we presented at the second Information Hiding Workshop twenty years ago.
Dynamic Searchable Symmetric Encryption (DSSE) allows a client to perform keyword searches over encrypted files via an encrypted data structure. Despite its merits, DSSE leaks search and update patterns when the client accesses the encrypted data structure. These leakages may create severe privacy problems as already shown, for example, in recent statistical attacks on DSSE. While Oblivious Random Access Memory (ORAM) can hide such access patterns, it incurs significant communication overhead and, therefore, it is not yet fully practical for cloud computing systems. Hence, there is a critical need to develop private access schemes over the encrypted data structure that can seal the leakages of DSSE while achieving practical search/update operations. In this paper, we propose a new oblivious access scheme over the encrypted data structure for searchable encryption purposes, that we call textlessutextgreaterDtextless/utextgreateristributed textlessutextgreaterOtextless/utextgreaterblivious textlessutextgreaterDtextless/utextgreaterata structure textlessutextgreaterDSSEtextless/utextgreater (DOD-DSSE). The main idea is to create a distributed encrypted incidence matrix on two non-colluding servers such that no arbitrary queries on these servers can be linked to each other. This strategy prevents not only recent statistical attacks on the encrypted data structure but also other potential threats exploiting query linkability. Our security analysis proves that DOD-DSSE ensures the unlink-ability of queries and, therefore, offers much higher security than traditional DSSE. At the same time, our performance evaluation demonstrates that DOD-DSSE is two orders of magnitude faster than ORAM-based techniques (e.g., Path ORAM), since it only incurs a small-constant number of communication overhead. That is, we deployed DOD-DSSE on geographically distributed Amazon EC2 servers, and showed that, a search/update operation on a very large dataset only takes around one second with DOD-DSSE, while it takes 3 to 13 minutes with Path ORAM-based methods.