Mixed Initiative and Collaborative Learning in Adversarial Environments (Jan 2020)
PIs: Claire Tomlin (Lead), Shankar Sastry, Xenofon Koutsoukos, Janos Sztipanovits
HARD PROBLEM(S) ADDRESSED: Human Behavior (primary), Resilient Architectures (secondary), and Scalability and Composability (secondary)
We have been developing a framework for incorporating human behavior into resilient robot motion planning, along with scalable, online safety updates of these motion plans.
PUBLICATIONS
- L. Peters, D. Fridovich-Keil, C. J. Tomlin, Z. N. Sunberg. Inference-Based Global Alignment for General-Sum Differential Games, under review, Nov 2019.
- M. Lim, Z. N. Sunberg, C. J. Tomlin. Sparse tree search optimality guarantees in POMDPs with continuous observation spaces, under review, Jan 2020.
KEY HIGHLIGHTS
In many settings where multiple agents interact, the optimal choices for each agent depend heavily on the choices of the others. These coupled interactions are well described by a general-sum differential game, in which players have differing objectives, the state evolves in continuous time, and optimal play is characterized by Nash equilibria. Often, such problems admit multiple Nash equilibria. From the perspective of a single agent in the game, this multiplicity of solutions introduces uncertainty about how other agents will behave. This work proposes a general framework for resolving ambiguity between Nash equilibria by reasoning about which equilibrium the other agents are aiming for. We demonstrate this framework in simulations of a multi-player human-robot navigation problem, with two main conclusions: first, by inferring which equilibrium the humans are operating under, the robot predicts their trajectories more accurately; second, by discovering and aligning itself to this equilibrium, the robot reduces the cost incurred by all players [1].
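The inference step above can be sketched in miniature. The following is an illustrative toy, not the implementation from [1]: it assumes each candidate equilibrium predicts a full trajectory for the other agents, and performs a Bayesian update over equilibria under a hypothetical Gaussian observation-noise model (the function name and noise model are our assumptions).

```python
import numpy as np

def infer_equilibrium(observed, predicted_trajs, prior, noise_std=0.5):
    """Posterior over candidate Nash equilibria given observed states.

    observed:        (T, d) array of observed agent states.
    predicted_trajs: list of (T, d) arrays, one per candidate equilibrium.
    prior:           initial probability over the candidate equilibria.
    """
    log_post = np.log(np.asarray(prior, dtype=float))
    for k, traj in enumerate(predicted_trajs):
        err = observed - traj
        # Gaussian log-likelihood of the residuals under equilibrium k
        log_post[k] += -0.5 * np.sum(err**2) / noise_std**2
    log_post -= log_post.max()  # shift for numerical stability
    post = np.exp(log_post)
    return post / post.sum()

# Toy usage: two candidate equilibria for a 2-D navigation problem,
# with observations generated near the second one.
t = np.linspace(0.0, 1.0, 10)[:, None]
eq1 = np.hstack([t, t])    # e.g. "robot yields" trajectory
eq2 = np.hstack([t, -t])   # e.g. "human yields" trajectory
obs = eq2 + 0.05 * np.random.default_rng(0).normal(size=eq2.shape)
posterior = infer_equilibrium(obs, [eq1, eq2], prior=[0.5, 0.5])
```

Once the posterior concentrates on one equilibrium, the robot can plan against that equilibrium's predicted trajectories, which is the alignment effect described above.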
Partially observable Markov decision processes (POMDPs) with continuous state and observation spaces offer powerful flexibility for representing real-world decision and control problems, but are notoriously difficult to solve. Recent online sampling-based algorithms that use observation likelihood weighting have shown unprecedented effectiveness in domains with continuous observation spaces. However, there has been no formal theoretical justification for this technique. This work offers such a justification, proving that a simplified algorithm, partially observable weighted sparse sampling (POWSS), estimates Q-values accurately with high probability and can be made to perform arbitrarily close to the optimal solution by increasing computational power [2].
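The key idea behind observation likelihood weighting can be illustrated with a one-step belief update. This is a hedged sketch, not the POWSS code: with continuous observations, two sampled observations almost never coincide, so rather than grouping sampled states by matching observations, each sampled next state is importance-weighted by the density of the actually received observation (the function names and the 1-D Gaussian models below are illustrative assumptions).

```python
import numpy as np

def weighted_belief_update(particles, transition, obs_likelihood, action, obs):
    """One-step particle belief update with observation likelihood weighting.

    particles:      list of states approximating the current belief.
    transition:     (state, action, rng) -> next state (generative model).
    obs_likelihood: (obs, next_state) -> density of obs given next_state.
    Returns the propagated particles and their normalized importance weights.
    """
    rng = np.random.default_rng(0)
    next_states = [transition(s, action, rng) for s in particles]
    weights = np.array([obs_likelihood(obs, s2) for s2 in next_states])
    weights /= weights.sum()
    return next_states, weights

# Toy 1-D example: additive-Gaussian dynamics and observation model.
def transition(s, a, rng):
    return s + a + rng.normal(0.0, 0.1)

def obs_likelihood(o, s):
    return np.exp(-0.5 * (o - s) ** 2 / 0.2**2)

particles = [0.0, 1.0, 2.0]
next_states, w = weighted_belief_update(
    particles, transition, obs_likelihood, action=0.0, obs=1.0)
```

In a sparse-sampling tree, these weighted particle sets stand in for beliefs at each node, and the Q-value estimates are computed as weighted averages over them.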
COMMUNITY ENGAGEMENTS
Claire Tomlin ran the 5th installment of Berkeley Girls in Engineering (GiE), a program held at UC Berkeley for middle school students, in Summer 2019. The program runs for 4 weeks, with 30 students participating per week, for a total of 120 students each summer. The week-long day camp includes hands-on modules spanning bioengineering, robotics, material science, computer science, water treatment, concrete design, and a range of other engineering topics. Students team up to complete a poster about an engineering problem and how they would solve it, which they present at the end of the week to the camp and family members.
EDUCATIONAL ADVANCES
We are developing a new course in systems theory at Berkeley, to be taken by upper level undergraduates and first and second year graduate students, on a rapprochement between control theory and reinforcement learning. The course will focus on a modern viewpoint on modeling, analysis, and control design, leveraging tools and successes from both systems and control theory and machine learning. The first version of this course will be taught by Shankar Sastry in Spring 2020.