Meta Preference Learning for Fast User Adaptation in Human-Supervisory Multi-Robot Deployments

Title: Meta Preference Learning for Fast User Adaptation in Human-Supervisory Multi-Robot Deployments
Publication Type: Conference Paper
Year of Publication: 2021
Authors: Huang, Chao; Luo, Wenhao; Liu, Rui
Conference Name: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Date Published: September
Keywords: Collaboration, human factors, Learning systems, multi-robot systems, multiprotocol label switching, Neural networks, policy-based governance, pubcrawl, resilience, Resiliency, robot operating systems security, Safety, security, Training
Abstract: As multi-robot systems (MRS) are widely used in tasks such as natural disaster response and social security, there is a strong expectation that an MRS should be operable by a general user without extensive training. However, humans have varying preferences for balancing task performance against safety, which impose different requirements on MRS control. An MRS that fails to comply with these preferences is harder to operate and reduces users' willingness to use it. Therefore, to improve both social acceptance and performance, there is an urgent need to adjust MRS behaviors to human preferences before human corrections are triggered, since such corrections increase cognitive load. In this paper, a novel Meta Preference Learning (MPL) method was developed to enable an MRS to adapt quickly to user preferences. Based on a meta-learning mechanism, MPL quickly assesses human preferences from limited instructions; a neural-network-based preference model then adjusts MRS behaviors for preference adaptation. To validate the method's effectiveness, a task scenario "an MRS searches for victims in an earthquake disaster site" was designed; 20 human users were involved to identify preferences as "aggressive", "medium", or "reserved"; based on user guidance and domain knowledge, about 20,000 preferences were simulated to cover different operations related to "task quality", "task progress", and "robot safety". The effectiveness of MPL in preference adaptation was validated by the reduced duration and frequency of human interventions.
DOI: 10.1109/IROS51168.2021.9636515
Citation Key: huang_meta_2021
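
The meta-learning mechanism described in the abstract (fast adaptation of a preference model from a few user instructions) can be illustrated with a minimal, first-order meta-learning sketch in the style of Reptile. This is not the authors' MPL implementation; the feature dimension, the simulated "user" tasks, and all hyperparameters below are hypothetical and chosen only to show the adapt-then-meta-update pattern.

    # Minimal Reptile-style meta-learning sketch (hypothetical, not the authors' MPL code).
    # A linear softmax "preference model" is meta-trained over many simulated users so that
    # it adapts to a new user's preference class ("aggressive" / "medium" / "reserved")
    # from only a handful of labelled instructions.
    import numpy as np

    rng = np.random.default_rng(0)
    N_FEAT, N_CLASS = 4, 3          # instruction features; three preference classes

    def softmax(z):
        z = z - z.max(axis=1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=1, keepdims=True)

    def loss_grad(W, X, y):
        """Cross-entropy loss and gradient of a linear softmax classifier."""
        p = softmax(X @ W)                       # (n, N_CLASS)
        onehot = np.eye(N_CLASS)[y]
        loss = -np.log(p[np.arange(len(y)), y]).mean()
        grad = X.T @ (p - onehot) / len(y)       # (N_FEAT, N_CLASS)
        return loss, grad

    def sample_user_task(n=10):
        """Hypothetical 'user': a random preference prototype generates
        a few labelled instructions used as the adaptation (support) set."""
        proto = rng.normal(size=(N_CLASS, N_FEAT))
        y = rng.integers(0, N_CLASS, size=n)
        X = proto[y] + 0.3 * rng.normal(size=(n, N_FEAT))
        return X, y

    W_meta = np.zeros((N_FEAT, N_CLASS))         # meta-parameters of the preference model
    inner_lr, meta_lr, inner_steps = 0.5, 0.1, 5

    for it in range(2000):                       # meta-training over many simulated users
        X, y = sample_user_task()
        W = W_meta.copy()
        for _ in range(inner_steps):             # fast adaptation on one user's few examples
            _, g = loss_grad(W, X, y)
            W -= inner_lr * g
        W_meta += meta_lr * (W - W_meta)         # Reptile meta-update: move toward adapted weights

    # Fast adaptation to a new, unseen user from only five instructions
    X_new, y_new = sample_user_task(n=5)
    W_user = W_meta.copy()
    for _ in range(inner_steps):
        _, g = loss_grad(W_user, X_new, y_new)
        W_user -= inner_lr * g
    print("post-adaptation loss:", loss_grad(W_user, X_new, y_new)[0])

In this sketch the inner loop plays the role of assessing a user's preference from limited instructions, and the meta-update makes that assessment fast for future users; the paper's actual preference model is a neural network trained on roughly 20,000 simulated preferences, which this toy example does not reproduce.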