Mixed Initiative and Collaborative Learning in Adversarial Environments
PIs: Claire Tomlin (Lead), Shankar Sastry, Xenofon Koutsoukos, Janos Sztipanovits
Reporting Period: (7/1/2021 – 12/31/2021)
Hard Problems Addressed
Resilient Architectures (primary)
Scalability and Composability (secondary)
In the two quarters July-Sept 2021 and Oct-Dec 2021, our focus has been on developing resilient control structures that use learning mechanisms. We have worked on techniques to better understand the vulnerabilities of Recurrent Neural Networks (RNNs) and thereby inform their robust design, on robust methods for handling time delays in cyber-physical systems (CPSs), and on planning in uncertain environments with high-dimensional observation spaces. In addition, we have been developing a method, called Lyapunov Density Models, that attempts to constrain a learning agent to states and actions resembling those in the training distribution.
Publications
[1] S. Deka, D. Lee, and C. J. Tomlin, “Towards cyber-physical systems robust to communication delays: A differential game approach”, IEEE Control Systems Letters, 2021, accepted for publication.
[2] S. Deka, D. Stipanovic, and C. J. Tomlin, “Dynamically Computing Adversarial Perturbations for Recurrent Neural Networks”, IEEE Transactions on Control Systems Technology, 2021, conditionally accepted.
[3] S. Deglurkar, M. H. Lim, J. Tucker, Z. N. Sunberg, A. Faust, and C. J. Tomlin, “Visual Learning-based Planning for Continuous High-Dimensional POMDPs”, submitted to the International Conference on Automated Planning and Scheduling (ICAPS), December 2022.
Key Highlights
Convolutional and recurrent neural networks have been widely employed to achieve state-of-the-art performance on classification tasks. However, these networks can be manipulated adversarially with relative ease through carefully crafted additive perturbations to the input. Although several prior works show experimentally how to craft and defend against such attacks, rigorous theoretical analysis is also needed to illuminate the conditions under which adversarial inputs exist. Our work on using a control structure to dynamically compute adversarial perturbations for RNNs provides both the theory and supporting experiments for real-time attacks. The focus is specifically on recurrent architectures. Drawing on dynamical systems theory, we cast the attack as a control problem, which allows the adversarial perturbation to be computed dynamically at each timestep of the input sequence, much as a feedback controller would. We have worked on a number of examples in Human Activity Recognition (HAR) and sentiment analysis in text.
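As a minimal sketch of the feedback view described above (not the method of [2]), the toy example below treats a small tanh RNN as a discrete-time dynamical system and chooses a bounded input perturbation at each timestep from the current hidden state, using a greedy one-step gradient-sign rule. The weights `W`, `U`, the readout `C`, the budget `eps`, and all function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy RNN: h_{t+1} = tanh(W h_t + U x_t), score = C . h_T
W = [[0.5, 0.1], [0.2, 0.4]]
U = [0.8, 0.3]
C = [1.0, 0.5]


def step(h, x):
    """One RNN update for a scalar input x."""
    return [
        math.tanh(sum(W[i][j] * h[j] for j in range(len(h))) + U[i] * x)
        for i in range(len(h))
    ]


def score(h):
    """Linear readout of the hidden state (stand-in for a class logit)."""
    return sum(ci * hi for ci, hi in zip(C, h))


def greedy_perturbation(h, x, eps):
    """Feedback rule: pick delta in {-eps, 0, +eps} to lower the next-step score."""
    h_next = step(h, x)
    # d(score(h_next)) / dx = sum_i C_i * (1 - h_next_i^2) * U_i
    grad = sum(C[i] * (1.0 - h_next[i] ** 2) * U[i] for i in range(len(h)))
    if grad > 0:
        return -eps
    if grad < 0:
        return eps
    return 0.0


def run(xs, eps=0.0):
    """Roll the RNN over xs, perturbing each input based on the current state."""
    h = [0.0, 0.0]
    deltas = []
    for x in xs:
        delta = greedy_perturbation(h, x, eps) if eps > 0 else 0.0
        deltas.append(delta)
        h = step(h, x + delta)
    return score(h), deltas


inputs = [0.3, 0.1, 0.4, 0.2]
clean_score, _ = run(inputs)
adv_score, deltas = run(inputs, eps=0.1)
```

Because each perturbation is computed from the hidden state reached so far, the attacker needs no access to future inputs, which is the property that makes a real-time attack possible. The greedy rule here is a simplification: it looks only one step ahead and does not optimize over the full sequence.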
Collaboration between interconnected cyber-physical systems is becoming increasingly pervasive. Time delays in the communication channels between such systems are known to induce catastrophic failure modes, such as high-frequency oscillations in robotic manipulators under bilateral teleoperation or string instability in platoons of autonomous vehicles. This work considers nonlinear time-delay systems representing coupled robotic agents, and proposes controllers that are robust to time-varying communication delays. We introduce approximations that allow the delays to be treated as implicit control inputs, and formulate the problem as a zero-sum differential game between the stabilizing controllers and the adversarially acting delays. Finally, the resulting optimal control law is compared with known results from Lyapunov-Krasovskii-based approaches via numerical experiments.
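To illustrate the min-max structure of such a game, the sketch below brute-forces a tiny discrete version: a controller picks a feedback gain, an adversary picks a constant delay, and the controller minimizes the worst-case quadratic cost. It is only an illustration under strong assumptions. The scalar linear system x'(t) = -gain * x(t - delay), the constant (not time-varying) delays, the grids, and the function names are all hypothetical, and this is not the differential game solution of [1].

```python
def delayed_cost(gain, delay, dt=0.01, horizon=20.0, x0=1.0):
    """Euler-simulate x'(t) = -gain * x(t - delay); return the integral of x^2."""
    steps = int(round(horizon / dt))
    lag = int(round(delay / dt))
    xs = [x0]
    cost = 0.0
    for k in range(steps):
        # Constant history: x(t) = x0 for all t <= 0.
        delayed = xs[k - lag] if k >= lag else x0
        cost += xs[k] ** 2 * dt
        xs.append(xs[k] + dt * (-gain * delayed))
    return cost


gains = [0.5, 1.0, 2.0]    # controller's candidate gains (hypothetical)
delays = [0.0, 0.5, 1.0]   # adversary's candidate delays in seconds (hypothetical)

# Inner maximization: the delay acts adversarially against each gain.
worst_case = {g: max(delayed_cost(g, d) for d in delays) for g in gains}
# Outer minimization: the controller picks the gain with the best worst case.
robust_gain = min(gains, key=lambda g: worst_case[g])
```

With these grids the largest gain is rejected: for this scalar delay equation the origin is asymptotically stable only when gain × delay < π/2, so a gain of 2.0 with a 1.0 s delay diverges even though the same gain performs best without delay.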
The Partially Observable Markov Decision Process (POMDP) is a powerful framework for capturing decision-making problems that involve state and transition uncertainty. However, most current POMDP planners cannot effectively handle the very high-dimensional observations often encountered in the real world (e.g., image observations in robotic domains). In this work, we propose Visual Tree Search (VTS), a learning and planning procedure that combines generative models learned offline with online model-based POMDP planning. VTS bridges offline model training and online planning by using a set of deep generative observation models to predict and evaluate the likelihood of image observations in a Monte Carlo tree search planner. We show that VTS is robust to different observation noises and, because it uses online, model-based planning, can adapt to different reward structures without re-training. This new approach outperforms a state-of-the-art on-policy planning baseline while using significantly less offline training time.
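One ingredient of this kind of planner, reweighting a particle belief by the likelihood of an observation, can be sketched as follows. This is a hedged illustration only: in VTS the likelihood would come from learned deep generative models over images, whereas here a hypothetical one-dimensional Gaussian function stands in for it, and the tree search itself is omitted. All names and parameters are assumptions.

```python
import math


def gaussian_likelihood(observation, state, sigma=0.5):
    """Hypothetical stand-in for a learned observation model p(o | s), unnormalized."""
    return math.exp(-0.5 * ((observation - state) / sigma) ** 2)


def update_belief(particles, weights, observation, likelihood):
    """Reweight particles by the observation likelihood, then normalize."""
    new_weights = [w * likelihood(observation, s) for s, w in zip(particles, weights)]
    total = sum(new_weights)
    if total == 0.0:
        # Degenerate case: keep the prior rather than divide by zero.
        return list(weights)
    return [w / total for w in new_weights]


def belief_mean(particles, weights):
    """Weighted mean of the particle set."""
    return sum(s * w for s, w in zip(particles, weights))


particles = [0.0, 1.0, 2.0, 3.0, 4.0]   # candidate states
prior = [0.2] * 5                        # uniform prior belief
posterior = update_belief(
    particles, prior, observation=3.0, likelihood=gaussian_likelihood
)
```

Passing the likelihood in as a function keeps the belief update independent of how the observation model is obtained, so a learned model could replace the Gaussian stand-in without changing the update itself.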
Community Engagement
December 2021: Claire Tomlin participated in the NeurIPS Workshop, “Safe and Robust Control of Uncertain Systems” (https://neurips.cc/virtual/2021/workshop/21843) as a panelist, and gave a talk at the NeurIPS workshop “Machine Learning for Autonomous Driving” (https://neurips.cc/virtual/2021/workshop/21855).
October 2021: Claire Tomlin gave a seminar in the online series Games, Decisions & Networks, called “Modeling Other Agents” (abstract below).
One of the biggest challenges in the design of autonomous systems is to effectively predict what other agents in the vicinity will do. If the goal is to preserve safety, the assumption that other agents take their most unsafe action leads to a zero-sum game formulation. In collision avoidance applications, for example, if I make this assumption about other agents, and then manage to stay outside of their reachable sets, then I am guaranteed to be safe. In this talk, we explore how this worst case assumption may be relaxed, and present game-theoretic motion planning results which use feedback Nash equilibrium problems to model interaction between agents. We demonstrate our results on both simulations and experiments of multiple vehicle scenarios.
October 2021: Claire Tomlin gave a keynote talk at Embedded Systems Week (ESWeek) 2021, entitled “Safe Learning in Robotics” (abstract below).
In many applications of autonomy in robotics, guarantees that constraints are satisfied throughout the learning process are paramount. We present a controller synthesis technique based on the computation of reachable sets, using optimal control and game theory. Then, we present methods for combining reachability with learning-based methods, to enable performance improvement while maintaining safety and to move towards safe robot control with learned models of the dynamics and the environment. We will illustrate these “safe learning” methods on robotic platforms at Berkeley, including demonstrations of motion planning around people, and navigating in a priori unknown environments.
March 2020 to present: Shankar Sastry launched a new institute, the C3 Digital Transformation Institute (https://c3dti.ai), a partnership of Berkeley and UIUC (co-leads) with U Chicago, CMU, MIT, Princeton, Stanford, and the Royal Institute of Technology, to develop the science and technology of Digital Transformation. The private philanthropy that supports this institute was leveraged in large part on Federal research support such as this SoS lablet. We have been furthering the SoS agenda in the workshops that this institute has run in the Spring; see https://c3dti.ai/events/workshops. In the reporting period, we held two workshops: one on Networks of Machine Learning, for Machine Learning, by Machine Learning (September 22–24, 2021), and a second on Digital Transformation of the Built Environment (October 26 & 28, 2021).