NC LG NE · Aug 23, 2022

What deep reinforcement learning tells us about human motor learning and vice-versa

Michele Garibbo, Casimir Ludwig, Nathan Lepora, Laurence Aitchison

arXiv:2208.10892v22.31 citations · h-index: 37

Originality: Highly original

AI Analysis

This work bridges neuroscience and AI by identifying limitations in current deep RL for motor adaptation, offering a new algorithm that could enhance robotic control or rehabilitation technologies.

The study addresses the gap between deep reinforcement learning (RL) algorithms and human motor learning by testing three families of deep RL algorithms on a mirror reversal perturbation; none of them mimicked human behaviour. In response, the authors introduce a novel algorithm, MB-DPG, which captures human error-based learning and learns faster than model-free methods on complex arm-based reaching tasks.

Machine learning, and specifically reinforcement learning (RL), has been extremely successful in helping us to understand neural decision-making processes. However, RL's role in understanding other neural processes, especially motor learning, is much less well explored. To explore this connection, we investigated how recent deep RL methods correspond to the dominant motor learning framework in neuroscience, error-based learning. Error-based learning can be probed using a mirror reversal adaptation paradigm, where it produces distinctive qualitative predictions that are observed in humans. We therefore tested three major families of modern deep RL algorithms on a mirror reversal perturbation. Surprisingly, all of the algorithms failed to mimic human behaviour and indeed displayed qualitatively different behaviour from that predicted by error-based learning. To fill this gap, we introduce a novel deep RL algorithm: model-based deterministic policy gradients (MB-DPG). MB-DPG draws inspiration from error-based learning by explicitly relying on the observed outcome of actions. We show that MB-DPG captures (human) error-based learning under mirror-reversal and rotational perturbations. Next, we demonstrate that error-based learning in the form of MB-DPG learns faster than canonical model-free algorithms on complex arm-based reaching tasks, while being more robust to (forward) model misspecification than model-based RL. These findings highlight the gap between current deep RL methods and human motor adaptation and offer a route to closing this gap, facilitating future beneficial interaction between the two fields.
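
To make the two perturbations named in the abstract concrete, the following is a minimal toy sketch in Python. It is not the MB-DPG algorithm and is not taken from the paper. It assumes a 2D reaching setup in which a rotational perturbation rotates the observed outcome by a fixed angle and a mirror reversal flips its horizontal component, and it assumes a very simple error-based learner that shifts its action by a fraction of the observed error, as if outcomes still moved in the same direction as actions. The function names, the learning rate, and the target are all hypothetical choices made for illustration.

```python
import math


def rotate(action, angle_deg=30.0):
    """Rotational perturbation: the outcome is the action rotated by a fixed angle."""
    a = math.radians(angle_deg)
    x, y = action
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))


def mirror(action):
    """Mirror reversal about the vertical axis: the horizontal component is flipped."""
    x, y = action
    return (-x, y)


def error_based_adaptation(perturb, target, steps=20, lr=0.2):
    """Toy error-based learner (an illustrative assumption, not MB-DPG).

    On each trial the learner observes the outcome of its action and moves
    the action by lr * error, assuming the pre-perturbation mapping in which
    a change in action produces the same change in outcome.
    Returns the error magnitude on every trial.
    """
    action = target  # start from the solution to the unperturbed task
    errors = []
    for _ in range(steps):
        outcome = perturb(action)
        err = (target[0] - outcome[0], target[1] - outcome[1])
        errors.append(math.hypot(err[0], err[1]))
        action = (action[0] + lr * err[0], action[1] + lr * err[1])
    return errors


target = (0.5, 1.0)  # hypothetical target, off the mirror axis
rotation_errors = error_based_adaptation(rotate, target)
mirror_errors = error_based_adaptation(mirror, target)
print("rotation: first %.3f, last %.3f" % (rotation_errors[0], rotation_errors[-1]))
print("mirror:   first %.3f, last %.3f" % (mirror_errors[0], mirror_errors[-1]))
```

In this toy setting the error shrinks trial by trial under the rotation but grows under the mirror reversal, because the horizontal correction is applied in the direction that the mirror turns into a larger error. This is only meant to illustrate why the two perturbations can separate learning rules qualitatively; the paper's own experiments and model should be consulted for the actual predictions and results.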
