Distributed Coordination of Agents For Air Traffic Flow Management
Abstract:
This project addresses the management of the air traffic system, a cyber-physical system in which a tight connection between the computational algorithms and the physical system is critical to safe, reliable, and efficient performance. Indeed, the lack of this tight connection is one reason current systems are overwhelmed by ever-increasing traffic and suffer whenever conditions deviate from the expected (e.g., changing weather). In this project, we study the impact of agent actions, rewards, and interactions on system performance using data from real air traffic systems. The objectives of this project are to:
- Derive reward estimation kernels to augment a new event-based air traffic simulator;
- Analyze the impact of modifying agent actions and rewards; and,
- Demonstrate the effectiveness of selecting agents' actions and rewards with real air traffic data obtained from historical congestion scenarios.
The intellectual merit of this project is in addressing the agent coordination problem in a physical setting by shifting the focus from "how to learn" to "what to learn." This paradigm shift allows us to separate the learning algorithms from the reward functions used to tie those learning systems into physical systems. By exploring agent reward functions that implicitly model agent interactions based on feedback from the real world, we design cyber-physical systems where an agent that learns to optimize its own reward also optimizes the system objective function.
The broader impact of this proposal is in providing new air traffic flow management algorithms that will significantly reduce air traffic congestion. The potential impact can be measured not only in currency ($30-41B yearly loss) but also in the improved experience of all travelers, providing a significant benefit to society. In addition, the PIs will use this project to train graduate and undergraduate students (i) by developing new courses in multiagent learning for transportation systems; and (ii) by providing summer internship opportunities at NASA Ames Research Center.
The results to date include a simulator that supports agent learning approaches, as well as new partitioning algorithms that reduce the complexity of the problem by selectively including sets of agents in other agents' reward calculations. We introduce a method in which agents representing aircraft choose their own ground delay with the intent of minimizing delay within the system. The agents use reinforcement learning in combination with the difference reward (a shaped reward) and hard constraints on congestion. This is critical because system performance depends on both congestion and delay, resulting in a difficult multi-objective optimization problem. The proposed approach addresses congestion through a greedy scheduler and reduces delay through the learning agents.
Though simple, this approach would lead to an intractable problem in its natural formulation. Instead, we introduce agent partitions to dramatically reduce the time complexity by up to 1000x, leading to a 30% increase in performance over the greedy (non-learning) solution. Differently sized partitions allow the approach to be tuned based on need: if results need to be computed quickly, a large number of partitions can be used, and if accuracy is critical, a smaller number of partitions can be used. Our results extend to a 14-hour simulation (28 hours from the first takeoff to the last landing) over the national airspace and incorporate over 35,000 agents.
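To make the role of the difference reward and the partitions concrete, the following is a minimal Python sketch under stated assumptions, not the project's simulator or its actual reward. The difference reward for agent i is taken here in its common form, the system reward with all agents minus the system reward with agent i removed. The toy objective, the constants `CAPACITY` and `PENALTY`, and all function names are hypothetical. Note also that this toy folds congestion into the objective as a soft penalty purely to keep the example short, whereas the approach described above enforces congestion as a hard constraint through a greedy scheduler.

```python
from collections import Counter
from typing import Dict, List, Sequence

CAPACITY = 2      # hypothetical: aircraft one time slot can absorb without congestion
PENALTY = 10.0    # hypothetical: weight of congestion relative to one unit of delay


def system_reward(schedule: Dict[str, int], delays: Dict[str, int],
                  agents: Sequence[str]) -> float:
    """Toy system objective G over the given agents (higher is better)."""
    # Each aircraft occupies the slot given by its scheduled time plus its chosen ground delay.
    load = Counter(schedule[a] + delays[a] for a in agents)
    congestion = sum(max(0, n - CAPACITY) ** 2 for n in load.values())
    total_delay = sum(delays[a] for a in agents)
    return -(total_delay + PENALTY * congestion)


def difference_reward(agent: str, schedule: Dict[str, int], delays: Dict[str, int],
                      agents: Sequence[str]) -> float:
    """D_i = G(all agents) - G(all agents except i)."""
    others = [a for a in agents if a != agent]
    return (system_reward(schedule, delays, agents)
            - system_reward(schedule, delays, others))


def partitioned_difference_reward(agent: str, schedule: Dict[str, int],
                                  delays: Dict[str, int],
                                  partitions: List[List[str]]) -> float:
    """Approximate D_i by evaluating G only over the partition that contains the agent."""
    group = next(p for p in partitions if agent in p)
    return difference_reward(agent, schedule, delays, group)
```

In this sketch, each partitioned reward evaluates the objective over one partition rather than over every agent, which is the source of the speedup; the approximation is exact only when agents in different partitions do not interact. Using more, smaller partitions trades accuracy for speed, matching the tuning described above.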