Mixed Initiative and Collaborative Learning in Adversarial Environments (Oct 2019)

PIs: Claire Tomlin (Lead), Shankar Sastry, Xenofon Koutsoukos, Janos Sztipanovits

HARD PROBLEM(S) ADDRESSED: Human Behavior (primary), Resilient Architectures (secondary), and Scalability and Composability (secondary)

We have been developing a framework for incorporating human behavior into resilient robot motion planning.  We have been using reachability analysis to develop scalable, online safety updates of these motion plans.  We have been developing scalable, analyzable methods for learning unknown dynamics within the framework of feedback linearization.

PUBLICATIONS (from the current quarter only; pending publications go in section 2 below)

  1. A. Bajcsy*, S. Bansal*, E. Bronstein, V. Tolani, C.J. Tomlin. An Efficient Reachability-Based Framework for Provably Safe Autonomous Navigation in Unknown Environments. Conference on Decision and Control (CDC), 2019.
     
  2. A. Bajcsy*, S.L. Herbert*, D. Fridovich-Keil, J.F. Fisac, S. Deglurkar, A.D. Dragan, C.J. Tomlin. A Scalable Framework For Real-Time Multi-Robot, Multi-Human Collision Avoidance. International Conference on Robotics and Automation (ICRA), 2019.
     
  3. D. Fridovich-Keil*, A. Bajcsy*, J.F. Fisac, S.L. Herbert, S. Wang, A.D. Dragan, C.J. Tomlin. Confidence-aware motion prediction for real-time collision avoidance. International Journal of Robotics Research (IJRR), 2019.
     

KEY HIGHLIGHTS

Key to the success of robot motion planning in environments with humans is the ability to design plans that are robust to human behavior. In order to safely operate around humans, robots can employ predictive models of human motion. Unfortunately, these models cannot capture the full complexity of human behavior and necessarily introduce simplifying assumptions. As a result, predictions may degrade whenever the observed human behavior departs from the assumed structure, which can have negative implications for safety. In this work, we observe that how "rational" human actions appear under a particular model can be viewed as an indicator of that model's ability to describe the human's current motion. By reasoning about this model confidence in a real-time Bayesian framework, we show that the robot can very quickly modulate its predictions to become more uncertain when the model performs poorly. Building on recent work in provably-safe trajectory planning, we leverage these confidence-aware human motion predictions to generate assured autonomous robot motion. Our new analysis combines worst-case tracking error guarantees for the physical robot with probabilistic time-varying human predictions, yielding a quantitative, probabilistic safety certificate. We demonstrate our approach with two quadcopters navigating around two human pedestrians in the lab, and show how this scales to multiple humans and robots in simulated environments [2,3].
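The Bayesian confidence update described above can be sketched in a few lines. This is an illustrative toy, not the code from [2,3]: the Boltzmann-rational action model, the discrete grid of confidence values, and the goal-seeking utility are all simplifying assumptions chosen for clarity.

```python
import numpy as np

# Hypothetical discrete grid of model-confidence values (beta) with a
# uniform prior. High beta = human actions look rational under the model.
betas = np.array([0.1, 1.0, 10.0])
belief = np.ones_like(betas) / len(betas)

goal = np.array([5.0, 0.0])                      # assumed human goal
actions = [np.array(a) for a in [(1, 0), (0, 1), (-1, 0), (0, -1)]]

def utility(x, u):
    # Illustrative utility: progress toward the assumed goal.
    return -np.linalg.norm((x + u) - goal)

def action_likelihood(x, u, beta):
    # Boltzmann-rational model: P(u | x; beta) ∝ exp(beta * utility).
    scores = np.array([np.exp(beta * utility(x, a)) for a in actions])
    return np.exp(beta * utility(x, u)) / scores.sum()

def update_belief(belief, x, u):
    # Bayes' rule over the confidence parameter beta.
    lik = np.array([action_likelihood(x, u, b) for b in betas])
    post = belief * lik
    return post / post.sum()

x = np.array([0.0, 0.0])
# A goal-directed action shifts belief mass toward high confidence...
b1 = update_belief(belief, x, np.array([1, 0]))
# ...while an action away from the goal shifts mass toward low
# confidence, so downstream predictions become more uncertain.
b2 = update_belief(belief, x, np.array([-1, 0]))
```

When the belief concentrates on low beta, the robot's predicted occupancy for the human spreads out, which is what lets the planner stay conservative exactly when the model is failing.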
Current safety analysis tools enable autonomous systems to reason about safety given full information about the state of the environment a priori. However, these tools do not scale well to scenarios where the environment is being sensed in real time, such as during navigation tasks. In this work, we propose a novel, real-time safety analysis method based on Hamilton-Jacobi reachability that provides strong safety guarantees despite environment uncertainty. Our safety method is planner-agnostic and provides guarantees for a variety of mapping sensors. We demonstrate our approach in simulation and in hardware to provide safety guarantees around a state-of-the-art vision-based, learning-based planner [1].
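The "planner-agnostic" aspect can be illustrated with a least-restrictive safety filter, in the spirit of Hamilton-Jacobi reachability. The sketch below uses a 1-D double-integrator braking example where the safe-set value function happens to be analytic; the names, numbers, and the analytic shortcut are hypothetical and stand in for the precomputed reachability value function used in [1].

```python
# Least-restrictive safety filter sketch (1-D double integrator).
A_MAX = 2.0   # assumed maximum braking deceleration
WALL = 10.0   # obstacle position, as sensed at runtime

def value(x, v):
    # Analytic proxy for the HJ value function: under full braking the
    # robot travels v^2 / (2*A_MAX) before stopping. Positive => the
    # state is in the safe set (collision still avoidable).
    stop_dist = max(v, 0.0) ** 2 / (2 * A_MAX)
    return WALL - (x + stop_dist)

def safety_filter(x, v, u_plan, margin=0.5):
    # Follow the (arbitrary) planner while the state is comfortably
    # inside the safe set; otherwise override with the optimal safety
    # control, here full braking.
    if value(x, v) > margin:
        return u_plan
    return -A_MAX

# Far from the wall the planner's command passes through unchanged;
# near the safe-set boundary the filter overrides with braking.
u_far = safety_filter(0.0, 1.0, u_plan=1.5)    # -> 1.5
u_near = safety_filter(8.0, 3.0, u_plan=1.0)   # -> -A_MAX
```

Because the filter only inspects the current state and the planner's command, any planner, including the vision-based learned planner mentioned above, can sit upstream of it.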
We present a novel approach to control design for nonlinear systems, which leverages reinforcement learning techniques to learn a linearizing controller for a physical plant with unknown dynamics. Feedback linearization is a technique from nonlinear control which renders the input-output dynamics of a nonlinear plant linear under application of an appropriate feedback controller. Once a linearizing controller has been constructed, desired output trajectories for the nonlinear plant can be tracked using a variety of linear control techniques. A single learned policy then serves to track arbitrary desired reference signals provided by a higher-level planner. We present theoretical results which provide conditions under which the learning problem has a unique solution which exactly linearizes the plant. We demonstrate the performance of our approach on two simulated problems and a physical robotic platform. For the simulated environments, we observe that the learned feedback linearizing policies can achieve arbitrary tracking of reference trajectories for a fully actuated double pendulum and a 14-dimensional quadrotor. In hardware, we demonstrate that our approach significantly improves tracking performance on a 7-DOF Baxter robot after less than two hours of training [Pending, 2].
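The core idea of learning a linearizing controller can be sketched on a toy scalar plant. This is a deliberately simplified illustration, not the paper's method: the plant, the least-squares fit, and all parameter values are hypothetical, and the actual work uses reinforcement learning on much higher-dimensional systems.

```python
import numpy as np

# Toy plant with unknown dynamics: xdot = a*x^3 + b*u (a, b unknown
# to the learner; chosen here only to generate data).
a_true, b_true = -1.5, 2.0

def plant(x, u):
    return a_true * x**3 + b_true * u

# Collect (x, u, xdot) data by exciting the plant with random inputs.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, 200)
U = rng.uniform(-1, 1, 200)
Xdot = plant(X, U)

# The dynamics are linear in the features [x^3, u], so a least-squares
# fit recovers estimates (a_hat, b_hat) of the unknown terms.
Phi = np.stack([X**3, U], axis=1)
a_hat, b_hat = np.linalg.lstsq(Phi, Xdot, rcond=None)[0]

def linearizing_controller(x, v):
    # Cancel the estimated nonlinearity so the closed loop behaves as
    # the linear system xdot = v for any commanded rate v.
    return (v - a_hat * x**3) / b_hat

# Any reference rate v from a higher-level planner is now tracked by
# the same learned policy: plant(x, linearizing_controller(x, v)) ≈ v.
x, v = 1.3, 0.7
u = linearizing_controller(x, v)
```

The same division of labor appears in the full approach: once the learned controller renders the plant (approximately) linear, arbitrary reference trajectories can be handed off to standard linear tracking controllers.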

COMMUNITY ENGAGEMENTS

Claire Tomlin ran the 5th installment of Berkeley Girls in Engineering (GiE), a program held at UC Berkeley for middle school students, in Summer 2019. The program runs for 4 weeks, with 30 students participating per week, for a total of 120 students each summer. The week-long day camp includes hands-on modules across all types of engineering, covering bioengineering, robotics, materials science, computer science, water treatment, concrete design, and a range of other engineering topics. Students team up to create a poster about an engineering problem and how they would solve it, which they present at the end of the week to the camp and family members.

EDUCATIONAL ADVANCES

We are developing a new course in systems theory at Berkeley, to be taken by upper level undergraduates and first and second year graduate students, on a rapprochement between control theory and reinforcement learning.  The course will focus on a modern viewpoint on modeling, analysis, and control design, leveraging tools and successes from both systems and control theory and machine learning.  The first version of this course will be taught by Shankar Sastry in Spring 2020.