End-to-end grasping policies for human-in-the-loop robots via deep reinforcement learning
This work addresses robustness issues in EMG-based human-in-the-loop grasping for robotics, offering a method to enhance policy training and transparency, though it appears incremental because it mainly integrates existing techniques.
The paper tackles the problem of robust human-in-the-loop robot grasping by developing an end-to-end training method using reinforcement and imitation learning in a stochastic simulation environment with real human trajectories, achieving improved policy performance through data augmentation and selection.
State-of-the-art human-in-the-loop robot grasping suffers greatly from the limited robustness of Electromyography (EMG) inference. As a workaround, researchers have been integrating EMG with other signals, often in an ad hoc manner. In this paper, we present a method for end-to-end training of a policy for human-in-the-loop robot grasping on real reaching trajectories. For this purpose, we use Reinforcement Learning (RL) and Imitation Learning (IL) in DEXTRON (DEXTerity enviRONment), a stochastic simulation environment with real human trajectories that are augmented and selected using a Monte Carlo (MC) simulation method. We also offer a success model which, once trained on the expert policy data and the RL policy roll-out transitions, can provide transparency into how the deep policy works and when it is likely to fail.
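As an illustration of what augmenting and selecting trajectories by Monte Carlo simulation can look like, the following is a minimal sketch and not the DEXTRON implementation. The function name `augment_and_select`, the perturbation model (a small random yaw and scaling), the end-point tolerance used as the selection rule, and the synthetic straight-line reach are all assumptions made for this example; the abstract does not specify any of them.

```python
import numpy as np

def augment_and_select(base_traj, target, n_samples=200, tol=0.05, seed=0):
    """Monte Carlo sketch: perturb one reaching trajectory, keep plausible samples.

    base_traj: (T, 3) array of hand positions in metres (hypothetical format).
    target:    (3,) array, object position the reach should end near.
    """
    rng = np.random.default_rng(seed)
    kept = []
    for _ in range(n_samples):
        angle = rng.normal(0.0, 0.1)   # small random yaw in radians (assumed)
        scale = rng.normal(1.0, 0.05)  # small random scaling (assumed)
        c, s = np.cos(angle), np.sin(angle)
        rot_z = np.array([[c, -s, 0.0],
                          [s,  c, 0.0],
                          [0.0, 0.0, 1.0]])
        # Rotate every point about the z axis, then scale about the origin.
        candidate = scale * base_traj @ rot_z.T
        # Selection rule (assumed): the hand must still end near the target.
        if np.linalg.norm(candidate[-1] - target) <= tol:
            kept.append(candidate)
    return kept

# Synthetic straight-line reach standing in for a recorded human trajectory.
base = np.linspace([0.0, 0.0, 0.0], [0.4, 0.0, 0.2], num=50)
selected = augment_and_select(base, target=base[-1])
print(len(selected), "of 200 sampled trajectories kept")
```

In a setup like this, the kept trajectories would serve as the stochastic set of reaches sampled by the simulation environment during training; the actual perturbations and selection criteria in DEXTRON may differ.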