RO Mar 14, 2021

Meta Preference Learning for Fast User Adaptation in Human-Supervisory Multi-Robot Deployments

arXiv:2103.08008v111.612 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of letting general users operate multi-robot systems more easily by adapting quickly to their preferences, though it appears incremental because it applies meta-learning to a specific domain.

The paper tackled the problem of adapting multi-robot system behaviors to human preferences to reduce cognitive load and improve social acceptance, achieving reduced duration and frequency of human interventions in a simulated earthquake disaster scenario.

As multi-robot systems (MRS) are widely used in tasks such as natural disaster response and social security, people expect an MRS to become ubiquitous, such that a general user without heavy training can operate one easily. However, humans differ in how they prefer to balance task performance and safety, which imposes different requirements on MRS control. Failing to comply with these preferences makes operation feel difficult and decreases people's willingness to use an MRS. Therefore, to improve social acceptance as well as performance, there is an urgent need to adjust MRS behaviors according to human preferences before human corrections, which increase cognitive load, are triggered. In this paper, a novel Meta Preference Learning (MPL) method was developed to enable an MRS to adapt quickly to user preferences. MPL, based on a meta-learning mechanism, can quickly assess human preferences from limited instructions; a neural-network-based preference model then adjusts MRS behaviors for preference adaptation. To validate the method's effectiveness, a task scenario, "An MRS searches for victims at an earthquake disaster site", was designed; 20 human users were involved to identify preferences as "aggressive", "medium", or "reserved"; based on user guidance and domain knowledge, about 20,000 preferences were simulated to cover different operations related to "task quality", "task progress", and "robot safety". The effectiveness of MPL in preference adaptation was validated by the reduced duration and frequency of human interventions.

