SaTC: CORE: Frontier: Collaborative: End-to-End Trustworthiness of Machine-Learning Systems

Project Details

Performance Period

Oct 01, 2018 - Sep 30, 2023

Institution(s)

University of California-San Diego

Award Number


This frontier project establishes the Center for Trustworthy Machine Learning (CTML), a large-scale, multi-institution, multi-disciplinary effort whose goal is to develop scientific understanding of the risks inherent to machine learning, and to develop the tools, metrics, and methods to manage and mitigate them. The center is led by a cross-disciplinary team developing unified theory, algorithms, and empirical methods within complex and ever-evolving ML approaches, application domains, and environments. The science and arsenal of defensive techniques emerging within the center will provide the basis for building future systems in a more trustworthy and secure manner, as well as fostering a long-term research community within this essential domain of technology. The center has a number of outreach efforts, including a massive open online course (MOOC) on this topic, an annual conference, and broad-based educational initiatives. The investigators continue their ongoing efforts at broadening participation in computing via a joint summer school on trustworthy ML aimed at underrepresented groups, and by engaging in activities for high school students across the country via a sequence of webinars advertised through the She++ network and other organizations.

The center focuses on three interconnected and parallel investigative directions that represent the different classes of attacks on ML systems: inference attacks, training attacks, and abuses of ML. The first direction explores inference-time security, namely methods to defend a trained model from adversarial inputs. This effort emphasizes developing formally grounded measurements of robustness against adversarial examples (defenses), as well as understanding the limits and costs of attacks. The second research direction aims to develop rigorously grounded measures of robustness to attacks that corrupt the training data, along with new training methods that are robust to adversarial manipulation. The final direction tackles the general security implications of sophisticated ML algorithms, including the potential abuses of generative ML models, such as models that generate (fake) content, as well as data mechanisms to prevent the theft of a machine learning model by an adversary who interacts with the model.