RO

Aug 24, 2021

Learning to Arbitrate Human and Robot Control using Disagreement between Sub-Policies

Yoojin Oh, Marc Toussaint, Jim Mainprice

arXiv:2108.10634v1

8.99 citations

Originality: Incremental advance

AI Analysis

This work addresses teleoperation efficiency for users of robot manipulators. It is an incremental advance: it builds on existing shared control methods and adds a novel reward mechanism.

The paper tackles the problem of blending human and robot commands in teleoperation by learning an arbitration strategy that gives the human more control authority at decision points. In experiments with simulated human controllers, the resulting shared control agent outperforms direct control across different users.
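To make the idea of blending concrete, here is a minimal sketch of linear blending, a common formulation in shared control. It assumes the arbitration output is a scalar `alpha` in [0, 1] that weights the human command; the function name `blend` and this exact parameterization are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def blend(u_human, u_robot, alpha):
    """Linearly blend two command vectors.

    alpha = 1.0 gives full authority to the human,
    alpha = 0.0 gives full authority to the robot.
    (Hypothetical helper; the paper may parameterize arbitration differently.)
    """
    alpha = float(np.clip(alpha, 0.0, 1.0))  # keep the weight in [0, 1]
    u_human = np.asarray(u_human, dtype=float)
    u_robot = np.asarray(u_robot, dtype=float)
    return alpha * u_human + (1.0 - alpha) * u_robot

# Example: the human pushes along x, the robot along y,
# and the arbitration weight favors the robot.
command = blend([1.0, 0.0], [0.0, 1.0], alpha=0.25)
```

Under this assumption, a learned arbitration strategy would raise `alpha` near decision points and lower it elsewhere.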

In the context of teleoperation, arbitration refers to deciding how to blend between human and autonomous robot commands. We present a reinforcement learning solution that learns an optimal arbitration strategy that allocates more control authority to the human when the robot comes across a decision point in the task. A decision point is where the robot encounters multiple options (sub-policies), such as having multiple paths to get around an obstacle or deciding between two candidate goals. By expressing each directional sub-policy as a von Mises distribution, we identify the decision points by observing the modality of the mixture distribution. Our reward function reasons on this modality and prioritizes matching the learned policy to either the user or the robot accordingly. We report teleoperation experiments on reaching for and grasping objects using a robot manipulator arm with different simulated human controllers. Results indicate that our shared control agent outperforms direct control and improves the teleoperation performance among different users. Using our reward term enables flexible blending between human and robot commands while maintaining safe and accurate teleoperation.
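The decision-point test described in the abstract can be illustrated with a small sketch: each sub-policy is modeled as a von Mises density over a planar direction, and the number of local maxima of the mixture is counted on a grid. The function names, the equal-weight mixture, the grid-based mode count, and the concentration values are all assumptions made for illustration; the paper's actual procedure may differ.

```python
import numpy as np

def von_mises_pdf(theta, mu, kappa):
    """Von Mises density on the circle with mean direction mu and concentration kappa."""
    return np.exp(kappa * np.cos(theta - mu)) / (2.0 * np.pi * np.i0(kappa))

def mixture_pdf(theta, mus, kappas, weights):
    """Weighted mixture of von Mises densities (weights are normalized here)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * von_mises_pdf(theta, m, k)
               for w, m, k in zip(weights, mus, kappas))

def count_modes(mus, kappas, weights, n_grid=720):
    """Count local maxima of the mixture on a uniform circular grid.

    A grid point is a peak if it is strictly greater than both neighbours;
    np.roll handles the wrap-around at +/- pi.
    """
    theta = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
    p = mixture_pdf(theta, mus, kappas, weights)
    is_peak = (p > np.roll(p, 1)) & (p > np.roll(p, -1))
    return int(is_peak.sum())

def is_decision_point(mus, kappas, weights):
    """Hypothetical rule: more than one mode means the sub-policies disagree."""
    return count_modes(mus, kappas, weights) > 1

# Sub-policies that nearly agree on direction: a single mode is expected.
agree = is_decision_point([0.1, -0.1], [4.0, 4.0], [0.5, 0.5])

# Sub-policies pointing in opposite directions (e.g. left or right of an obstacle).
disagree = is_decision_point([np.pi / 2, -np.pi / 2], [4.0, 4.0], [0.5, 0.5])
```

In this sketch, `agree` evaluates to `False` and `disagree` to `True`, which matches the intuition that a multimodal mixture marks a point where the human should receive more control authority.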

