Mixed Initiative and Collaborative Learning in Adversarial Environments (Jan-Jun 2021)
PIs: Claire Tomlin (Lead), Shankar Sastry, Xenofon Koutsoukos, Janos Sztipanovits
Reporting Period: 1/1/2021 – 6/30/2021
Hard Problems Addressed
Human Behavior (primary)
Resilient Architectures (secondary)
Scalability and Composability (secondary)
Over the two quarters January–March 2021 and April–June 2021, our focus has been on analyzing and designing attacks on learning in multi-player games. There are two threads of work. The first, on new attacks on and defenses of Recurrent Neural Networks and POMDPs, has been carried out by researchers Shankar Deka and Sampada Deglurkar in Professor Claire Tomlin's group. In the second, in Professor Sastry's group, Victoria Tuck has been identifying new vulnerabilities introduced by incentive design for optimal utilization of a scarce resource in multi-player settings (attacks on mechanism design in multi-player decentralized games).
Publications
[1] Shankar A Deka, Dušan M Stipanović, and Claire J Tomlin. Feedback-Control Based Adversarial Attacks on Recurrent Neural Networks. In 59th IEEE Conference on Decision and Control (CDC), pages 4677–4682, 2020. https://doi.org/10.1109/CDC42340.2020.9303949.
[2] Shankar A Deka, Dušan M Stipanović, and Claire J Tomlin. Dynamically Computing Adversarial Perturbations for Recurrent Neural Networks. Submitted to IEEE Transactions on Control Systems Technology. https://arxiv.org/abs/2009.02874.
Adversarial attacks in the context of Recurrent Neural Network (RNN) based sequence classifiers involve carefully crafting very small perturbations to a nominal input in order to fool the network into misclassifying the input signal. The main goal of this project is to find analytical conditions under which RNNs are susceptible to such adversarially constructed input perturbations. By taking a dynamical systems approach, the problem of finding adversarial perturbations is reframed as a control synthesis problem, with the disturbances viewed as control inputs. Stability and robustness of RNNs can then be studied using well-developed tools from control theory. Concretely, we obtain sufficient conditions, in terms of the trainable parameters of the RNN, under which adversarial perturbations exist, and show how such perturbations can be dynamically constructed as a feedback control by leveraging the sequential nature of RNNs. Our algorithm scales gracefully with the length of the nominal input sequence and can be deployed for real-time attacks. Moreover, given our controls framework, perturbations may also be designed using off-the-shelf tools from optimal control. This work has been tested on a wide variety of sequence classification examples, including tasks such as sentiment classification from the NLP domain.
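As an illustration of this feedback viewpoint, the minimal sketch below constructs a perturbation for a toy PyTorch RNN classifier one time step at a time, choosing the perturbation at step t from the gradient of the classification loss while holding earlier (already-perturbed) steps fixed. The network, dimensions, and the gradient-sign update are placeholder assumptions for illustration, not the exact construction of [1] or [2].

```python
# Minimal sketch: per-step adversarial perturbation of an RNN sequence
# classifier, treating the perturbation at each time step as a "control
# input" chosen sequentially. Illustrative assumptions throughout.
import torch
import torch.nn as nn

torch.manual_seed(0)

class RNNClassifier(nn.Module):
    def __init__(self, n_in=4, n_hidden=16, n_classes=2):
        super().__init__()
        self.rnn = nn.RNN(n_in, n_hidden, batch_first=True)
        self.head = nn.Linear(n_hidden, n_classes)

    def forward(self, x):
        _, h = self.rnn(x)          # h: final hidden state
        return self.head(h[-1])     # class logits

model = RNNClassifier()
x_nom = torch.randn(1, 20, 4)                  # nominal input sequence
y_nom = model(x_nom).argmax(dim=1)             # nominal predicted class

eps = 0.05                                     # per-step perturbation bound
x_adv = x_nom.clone()
for t in range(x_nom.shape[1]):
    x_adv = x_adv.detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x_adv), y_nom)
    loss.backward()
    # Ascend the loss along the gradient sign at time step t only.
    step = eps * x_adv.grad[:, t, :].sign()
    with torch.no_grad():
        x_adv[:, t, :] += step

print("nominal class:", y_nom.item(),
      "perturbed class:", model(x_adv).argmax(dim=1).item())
```

Because each step's perturbation depends only on the sequence so far, the same loop structure can in principle run online as the input streams in, which is what makes the feedback formulation attractive for real-time attacks.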
[3] Victoria Tuck, Yash Pant, and Shankar Sastry. Decentralized Path Planning for Moving Obstacles and Multi-Agent Systems. To appear in the 2021 IEEE Conference on Control Technology and Applications (CCTA), San Diego, August 2021.
A canonical problem for unmanned vehicles (ground, air, and waterborne) is to provide provably safe algorithms for avoiding moving obstacles, including other agents. A large family of probabilistically complete algorithms, such as RRT*, has been proposed for this purpose, but they are essentially unusable in practice for such elementary applications as driving in cluttered environments, because they are centralized and kinematic (that is, based on computational geometry rather than the dynamics of the agents). This glaring omission has caused problems for airspace management of UAVs (unmanned aircraft systems, UAS) and for the certification of self-driving cars. The resilience aspect of these algorithms is that they must be designed to be robust to adversarial attack by rogue vehicles. The most common approach has been game-theoretic, using inverse reinforcement learning to determine adversarial intent; the practical problem is that these solutions are usually so conservative as to be useless in real-world scenarios. A possible way around this is the use of Model Predictive and Learning Games, in the receding-horizon spirit of the sketch below.
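As a toy illustration of the receding-horizon alternative, the sketch below solves one model-predictive step for a single agent avoiding one moving obstacle under a hard clearance constraint. The single-integrator dynamics, constant-velocity obstacle prediction, and SciPy solver are simplifying assumptions for illustration, not the decentralized multi-agent algorithm of [3].

```python
# Minimal sketch of one receding-horizon (MPC) step: drive toward a goal
# while keeping a minimum distance from a predicted moving obstacle.
import numpy as np
from scipy.optimize import minimize

H, dt = 8, 0.2                      # horizon length, time step
x0 = np.array([0.0, 0.0])           # agent's current position
goal = np.array([4.0, 0.0])
d_min = 0.5                         # required clearance from the obstacle

# Predicted obstacle positions over the horizon (constant-velocity model).
obs0, obs_v = np.array([2.0, -0.6]), np.array([0.0, 0.4])
obs_pred = np.array([obs0 + (k + 1) * dt * obs_v for k in range(H)])

def rollout(u_flat):
    """Integrate single-integrator dynamics from stacked velocity inputs."""
    u = u_flat.reshape(H, 2)
    return x0 + dt * np.cumsum(u, axis=0)

def cost(u_flat):
    traj = rollout(u_flat)
    return np.sum((traj - goal) ** 2) + 0.1 * np.sum(u_flat ** 2)

# Hard constraint at each step: squared obstacle distance >= d_min^2.
constraints = [{"type": "ineq",
                "fun": lambda u, k=k: np.sum((rollout(u)[k] - obs_pred[k]) ** 2)
                                      - d_min ** 2}
               for k in range(H)]
bounds = [(-1.0, 1.0)] * (2 * H)    # per-axis velocity limits

res = minimize(cost, np.zeros(2 * H), bounds=bounds,
               constraints=constraints, method="SLSQP")
u_next = res.x.reshape(H, 2)[0]     # apply only the first input, then replan
print("first control input:", u_next, "success:", res.success)
```

In a decentralized setting, each agent would run such a step in parallel, substituting broadcast or predicted trajectories of its neighbors for the obstacle prediction; handling adversarial neighbors is exactly where the game-theoretic and learning components enter.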
Key Highlights
Resilience and the Science of Security for multi-agent systems is certainly a hot area across a number of fields right now, straddling AI/ML, control theory, and robotics. The work on aggregative games and on model-predictive learning applied to multi-player games seems to provide the best set of underpinning tools for this purpose. We see this area growing tremendously in importance.
Community Engagement
Shankar Sastry launched a new institute, the C3 Digital Transformation Institute (https://c3dti.ai), a partnership of Berkeley and UIUC (co-leads) with U Chicago, CMU, MIT, Princeton, Stanford, and the Royal Institute of Technology, to develop the science and technology of Digital Transformation. The private philanthropy that supports this institute was very much leveraged on federal research support such as this SoS lablet. We have been furthering the agenda of SoS in the workshops that this institute ran in the Spring (see https://c3dti.ai/events/workshops). Two workshops, one in March 2021 and the other in May 2021, focused on new vulnerabilities introduced by learning algorithms in multi-player systems.
Educational Advances
Professors Tomlin and Sastry have taken the lead in revamping large parts of the undergraduate and graduate curriculum to feature the recent confluence of AI/ML, robotics, and control. In the Fall, Sastry taught his course Introduction to Robotics (a mezzanine course for undergraduates and Masters students) to about 150 students; see https://ucb-ee106.github.io/106a-fa20site/ for the resources associated with this class. In partnership with OSD's National Security Innovation Network (NSIN), we are placing the top students from this class at various DoD labs in Summer 2021 (this was piloted successfully in Summer 2020; see https://blumcenter.berkeley.edu/news-posts/national-security-innovation-with-uc-berkeley/).
In Spring 2021, Sastry taught the second-term version of this class, EECS 106B/206B, which included the research results from both of the projects discussed above. Also, Profs. Ma, Malik, Sastry, and Tomlin collaborated in teaching a new graduate class, EECS 290-5, combining vision, deep learning, and control. These three courses are all part of the curriculum revamp mentioned above.