Mixed Initiative and Collaborative Learning in Adversarial Environments (October 2020 and January 2021)

Submitted by willirn1 on Mon, 01/11/2021 - 12:23pm

PIs: Claire Tomlin (Lead), Shankar Sastry, Xenofon Koutsoukos, Janos Sztipanovits

HARD PROBLEM(S) ADDRESSED: Human Behavior (primary), Resilient Architectures (secondary), and Scalability and Composability (secondary)

In the two quarters June-Sept 2020 and Oct-Dec 2020, my focus has been on analyzing and designing learning in multi-player games. This is related to earlier work that we reported in 2019 and early 2020, where we showed the problems of converging to Nash equilibria in multi-player games when gradient descent is used to update the policies. Two papers that have not yet been submitted are in progress, and I describe them below. The general theme is to understand how to incentivize utilization of a scarce resource in multi-player settings (mechanism design in multi-player decentralized games).

PUBLICATIONS (from the current quarter only; pending publications go in section 2 below)

Kshitij Kulkarni, Victoria Tuck, Chinmay Maheshwari, and Shankar Sastry, “The Fremont Problem: Setting Speeds and Tolls to incentivize better utilization of congested roads”, (preprint to be submitted February 2021).
In routing games, vehicles choose paths to minimize their own travel time between their origin and their destination. As a result, some local roads near congested highways get higher flow than that which they are designed to sustain, which creates negative externalities. In this article, we consider the case where a city has the ability to influence traffic by decreasing or increasing the speed limit on city roads. We show that a decrease in speed limit can be interpreted as a Pigovian tax with imperfect differentiation in routing games (with travel time being interpreted as a price to pay to travel). Decreasing speed limits inside the city can improve the city's social welfare by incentivizing cut-through travelers to not use local roads. A trade-off is exhibited between residential accessibility (increasing the speed limit) and residential safety (decreasing the speed limit to decrease flow on local roads) in this benchmark framework. This article shows that decreasing speed limits is not optimal because it uniformly taxes every vehicle (residents and cut-through travelers), but that a route-based pricing is optimal as it enables one to specifically tax the flow responsible for the externality.
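The claim that a lower speed limit acts like a tax paid in travel time can be illustrated with a toy two-route network. The sketch below is not the model from the paper: the function name, the linear highway latency, the congestion-free local road, and all parameter values are hypothetical choices made only for illustration.

```python
def local_road_flow(speed_limit, demand=1.0, length=1.0,
                    free_flow_time=0.5, congestion_slope=1.0):
    """Wardrop-equilibrium flow on the local road of a two-route network.

    Hypothetical toy model (not taken from the paper):
      highway travel time = free_flow_time + congestion_slope * highway_flow
      local travel time   = length / speed_limit   (no congestion)
    """
    if speed_limit <= 0:
        raise ValueError("speed_limit must be positive")
    local_time = length / speed_limit
    # At an interior equilibrium both routes take the same time, so solve
    # free_flow_time + congestion_slope * highway_flow == local_time.
    highway_flow = (local_time - free_flow_time) / congestion_slope
    # Clamp to [0, demand]: outside that range every driver uses one route.
    highway_flow = min(max(highway_flow, 0.0), demand)
    return demand - highway_flow


if __name__ == "__main__":
    for v in (1.0, 0.8, 0.5):
        print(f"speed limit {v:.1f}: local-road flow {local_road_flow(v):.2f}")
```

Under these assumed parameters, lowering the speed limit from 1.0 to 0.8 to 0.5 reduces the local-road flow from 0.50 to 0.25 to 0.00. The added travel time applies to every vehicle on the local road, residents included, which is the uniformity that the abstract contrasts with route-based pricing.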
Victoria Tuck, Yash Pant, and Shankar Sastry, “Decentralized Path Planning for Moving Obstacles and Multi-Agent Systems”.
A canonical problem for unmanned vehicles (ground, air, and water borne) is to provide provably safe algorithms for avoiding moving obstacles, including other agents. A large number of probabilistically complete algorithms, such as RRT*, have been proposed for this purpose. But they are essentially unusable in practice for such elementary applications as driving in cluttered environments, because they are centralized and kinematic (that is, they are based on computational geometry rather than on the dynamics of the agents). This glaring omission has resulted in problems for airspace management for UAVs (sometimes called UAS traffic management) and for the certification of self-driving cars. The resilience aspect of these algorithms is that they need to be designed to be robust to adversarial attack by rogue vehicles. The most common approach here has been a game-theoretic one, with inverse reinforcement learning used to determine adversarial intent. The practical problem is that these solutions are usually so conservative as to be useless in real-world scenarios. A possible way around this is the use of Model Predictive and Learning Games. We expect this project to be a long-running one, with several papers to follow.

KEY HIGHLIGHTS
Each effort should submit one or two specific highlights. Each item should include a paragraph or two along with a citation if available. Write as if for the general reader of IEEE S&P.
The purpose of the highlights is to give our immediate sponsors a body of evidence that the funding they are providing (in the framework of the SoS lablet model) is delivering results that "more than justify" the investment they are making.

Resilience and the Science of Security for multi-agent systems is currently a very active area in a number of fields straddling AI/ML, control theory, and robotics. The work on aggregative games and on model predictive learning applied to multi-player games seems to provide the best set of underpinning tools for this purpose.

COMMUNITY ENGAGEMENTS

Shankar Sastry launched a new institute, the C3 Digital Transformation Institute (https://c3dti.ai), a partnership of Berkeley and UIUC (co-leads) with UChicago, CMU, MIT, Princeton, and Stanford to develop the science and technology of Digital Transformation. The private philanthropy that supports this institute was very much leveraged on the support of Federal research such as this SoS lablet. We have been furthering the agenda of SoS in the workshops that this institute ran in the Fall (see https://c3dti.ai/events/workshops). Some spectacular connections between the spread of fake news, wireless networks, and pandemic spread were reported in the first workshop in September 2020. The third workshop, in December 2020, featured a tour de force of new methods in Robust and Provably Safe Autonomy and was co-organized by Professor Claire Tomlin.

EDUCATIONAL ADVANCES

Professors Tomlin and Sastry have taken the lead in revamping large parts of the undergraduate and graduate curriculum to feature the recent confluence of AI/ML, robotics, and control. In the Fall, Sastry taught his course Introduction to Robotics (a mezzanine course for undergraduates and Masters students) to about 150 students (see https://ucb-ee106.github.io/106a-fa20site/ for the resources associated with this class). In a partnership with OSD’s National Security Innovation Network (NSIN), we are placing the top students from this class at various labs in the DoD in Summer 2021 (this was piloted successfully in Summer 2020; see https://blumcenter.berkeley.edu/news-posts/national-security-innovation-with-uc-berkeley/).
This term, Sastry is teaching the second-term version of this class, EECS 106B/206B, which will include the research results from both of the projects discussed above. Also, Profs. Ma, Malik, Sastry, and Tomlin are collaborating in teaching a new graduate class combining vision, deep learning, and control, EECS 290-5. These three courses are all part of the revamping of the curriculum mentioned above.
