Biblio
Suppose we are given a large number of sequences on a given alphabet, and an adversary is interested in identifying (de-anonymizing) a specific target sequence based on its patterns. Our goal is to thwart such an adversary by obfuscating the target sequences by applying artificial (but small) distortions to its values. A key point here is that we would like to make no assumptions about the statistical model of such sequences. This is in contrast to existing literature where assumptions (e.g., Markov chains) are made regarding such sequences to obtain privacy guarantees. We relate this problem to a set of combinatorial questions on sequence construction based on which we are able to obtain provable guarantees. This problem is relevant to important privacy applications: from fingerprinting webpages visited by users through anonymous communication systems to linking communicating parties on messaging applications to inferring activities of users of IoT devices.