LG AI ROMay 16, 2023

Continually Learned Pavlovian Signalling Without Forgetting for Human-in-the-Loop Robotic Control

Adam S. R. Parker, Michael R. Dawson, Patrick M. Pilarski

arXiv:2305.14365v12.0

Originality Synthesis-oriented

AI Analysis

This addresses the challenge of making robotic prostheses more intuitive for users by improving adaptive feedback, though it is incremental as it builds on existing prediction learning methods.

The study tackled the problem of forgetting previously learned predictions in Pavlovian signalling for robotic prosthetic control, finding that a look-ahead state representation enabled learning without forgetting, contrary to expectations about off-policy methods.

Artificial limbs are sophisticated devices to assist people with tasks of daily living. Despite advanced robotic prostheses demonstrating similar motion capabilities to biological limbs, users report them difficult and non-intuitive to use. Providing more effective feedback from the device to the user has therefore become a topic of increased interest. In particular, prediction learning methods from the field of reinforcement learning -- specifically, an approach termed Pavlovian signalling -- have been proposed as one approach for better modulating feedback in prostheses since they can adapt during continuous use. One challenge identified in these learning methods is that they can forget previously learned predictions when a user begins to successfully act upon delivered feedback. The present work directly addresses this challenge, contributing new evidence on the impact of algorithmic choices, such as on- or off-policy methods and representation choices, on the Pavlovian signalling from a machine to a user during their control of a robotic arm. Two conditions of algorithmic differences were studied using different scenarios of controlling a robotic arm: an automated motion system and human participant piloting. Contrary to expectations, off-policy learning did not provide the expected solution to the forgetting problem. We instead identified beneficial properties of a look-ahead state representation that made existing approaches able to learn (and not forget) predictions in support of Pavlovian signalling. This work therefore contributes new insight into the challenges of providing learned predictive feedback from a prosthetic device, and demonstrates avenues for more dynamic signalling in future human-machine interactions.

View on arXiv PDF

Similar