Visible to the public Biblio

Filters: Author is Roth, Aaron  [Clear All Filters]
2023-01-06
Golatkar, Aditya, Achille, Alessandro, Wang, Yu-Xiang, Roth, Aaron, Kearns, Michael, Soatto, Stefano.  2022.  Mixed Differential Privacy in Computer Vision. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :8366—8376.
We introduce AdaMix, an adaptive differentially private algorithm for training deep neural network classifiers using both private and public image data. While pre-training language models on large public datasets has enabled strong differential privacy (DP) guarantees with minor loss of accuracy, a similar practice yields punishing trade-offs in vision tasks. A few-shot or even zero-shot learning baseline that ignores private data can outperform fine-tuning on a large private dataset. AdaMix incorporates few-shot training, or cross-modal zero-shot learning, on public data prior to private fine-tuning, to improve the trade-off. AdaMix reduces the error increase from the non-private upper bound from the 167–311% of the baseline, on average across 6 datasets, to 68-92% depending on the desired privacy level selected by the user. AdaMix tackles the trade-off arising in visual classification, whereby the most privacy sensitive data, corresponding to isolated points in representation space, are also critical for high classification accuracy. In addition, AdaMix comes with strong theoretical privacy guarantees and convergence analysis.
2020-10-05
Joseph, Matthew, Mao, Jieming, Neel, Seth, Roth, Aaron.  2019.  The Role of Interactivity in Local Differential Privacy. 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS). :94—105.

We study the power of interactivity in local differential privacy. First, we focus on the difference between fully interactive and sequentially interactive protocols. Sequentially interactive protocols may query users adaptively in sequence, but they cannot return to previously queried users. The vast majority of existing lower bounds for local differential privacy apply only to sequentially interactive protocols, and before this paper it was not known whether fully interactive protocols were more powerful. We resolve this question. First, we classify locally private protocols by their compositionality, the multiplicative factor by which the sum of a protocol's single-round privacy parameters exceeds its overall privacy guarantee. We then show how to efficiently transform any fully interactive compositional protocol into an equivalent sequentially interactive protocol with a blowup in sample complexity linear in this compositionality. Next, we show that our reduction is tight by exhibiting a family of problems such that any sequentially interactive protocol requires this blowup in sample complexity over a fully interactive compositional protocol. We then turn our attention to hypothesis testing problems. We show that for a large class of compound hypothesis testing problems - which include all simple hypothesis testing problems as a special case - a simple noninteractive test is optimal among the class of all (possibly fully interactive) tests.

2020-03-10
Dwork, Cynthia, Roth, Aaron.  2014.  The Algorithmic Foundations of Differential Privacy. Found. Trends Theor. Comput. Sci.. 9:211–407.

The problem of privacy-preserving data analysis has a long history spanning multiple disciplines. As electronic data about individuals becomes increasingly detailed, and as technology enables ever more powerful collection and curation of these data, the need increases for a robust, meaningful, and mathematically rigorous definition of privacy, together with a computationally rich class of algorithms that satisfy this definition. Differential Privacy is such a definition.After motivating and discussing the meaning of differential privacy, the preponderance of this monograph is devoted to fundamental techniques for achieving differential privacy, and application of these techniques in creative combinations, using the query-release problem as an ongoing example. A key point is that, by rethinking the computational goal, one can often obtain far better results than would be achieved by methodically replacing each step of a non-private computation with a differentially private implementation. Despite some astonishingly powerful computational results, there are still fundamental limitations — not just on what can be achieved with differential privacy but on what can be achieved with any method that protects against a complete breakdown in privacy. Virtually all the algorithms discussed herein maintain differential privacy against adversaries of arbitrary computational power. Certain algorithms are computationally intensive, others are efficient. Computational complexity for the adversary and the algorithm are both discussed.We then turn from fundamentals to applications other than queryrelease, discussing differentially private methods for mechanism design and machine learning. The vast majority of the literature on differentially private algorithms considers a single, static, database that is subject to many analyses. Differential privacy in other models, including distributed databases and computations on data streams is discussed.Finally, we note that this work is meant as a thorough introduction to the problems and techniques of differential privacy, but is not intended to be an exhaustive survey — there is by now a vast amount of work in differential privacy, and we can cover only a small portion of it.

2017-10-10
Cummings, Rachel, Ligett, Katrina, Radhakrishnan, Jaikumar, Roth, Aaron, Wu, Zhiwei Steven.  2016.  Coordination Complexity: Small Information Coordinating Large Populations. Proceedings of the 2016 ACM Conference on Innovations in Theoretical Computer Science. :281–290.

We initiate the study of a quantity that we call coordination complexity. In a distributed optimization problem, the information defining a problem instance is distributed among n parties, who need to each choose an action, which jointly will form a solution to the optimization problem. The coordination complexity represents the minimal amount of information that a centralized coordinator, who has full knowledge of the problem instance, needs to broadcast in order to coordinate the n parties to play a nearly optimal solution. We show that upper bounds on the coordination complexity of a problem imply the existence of good jointly differentially private algorithms for solving that problem, which in turn are known to upper bound the price of anarchy in certain games with dynamically changing populations. We show several results. We fully characterize the coordination complexity for the problem of computing a many-to-one matching in a bipartite graph. Our upper bound in fact extends much more generally to the problem of solving a linearly separable convex program. We also give a different upper bound technique, which we use to bound the coordination complexity of coordinating a Nash equilibrium in a routing game, and of computing a stable matching.

2017-03-07
Hsu, Justin, Morgenstern, Jamie, Rogers, Ryan, Roth, Aaron, Vohra, Rakesh.  2016.  Do Prices Coordinate Markets? Proceedings of the Forty-eighth Annual ACM Symposium on Theory of Computing. :440–453.

Walrasian equilibrium prices have a remarkable property: they allow each buyer to purchase a bundle of goods that she finds the most desirable, while guaranteeing that the induced allocation over all buyers will globally maximize social welfare. However, this clean story has two caveats. * First, the prices may induce indifferences. In fact, the minimal equilibrium prices necessarily induce indifferences. Accordingly, buyers may need to coordinate with one another to arrive at a socially optimal outcome—the prices alone are not sufficient to coordinate the market. * Second, although natural procedures converge to Walrasian equilibrium prices on a fixed population, in practice buyers typically observe prices without participating in a price computation process. These prices cannot be perfect Walrasian equilibrium prices, but instead somehow reflect distributional information about the market. To better understand the performance of Walrasian prices when facing these two problems, we give two results. First, we propose a mild genericity condition on valuations under which the minimal Walrasian equilibrium prices induce allocations which result in low over-demand, no matter how the buyers break ties. In fact, under genericity the over-demand of any good can be bounded by 1, which is the best possible at the minimal prices. We demonstrate our results for unit demand valuations and give an extension to matroid based valuations (MBV), conjectured to be equivalent to gross substitute valuations (GS). Second, we use techniques from learning theory to argue that the over-demand and welfare induced by a price vector converge to their expectations uniformly over the class of all price vectors, with respective sample complexity linear and quadratic in the number of goods in the market. These results make no assumption on the form of the valuation functions. These two results imply that under a mild genericity condition, the exact Walrasian equilibrium prices computed in a market are guaranteed to induce both low over-demand and high welfare when used in a new market where agents are sampled independently from the same distribution, whenever the number of agents is larger than the number of commodities in the market.