SA-Net: Deep Neural Network for Robot Trajectory Recognition from RGB-D Streams
This work addresses a crucial prerequisite for online learning from demonstration in robotics, though it appears incremental as it builds on existing deep learning approaches for a specific domain.
The paper tackles the problem of recognizing state-action pairs from RGB-D data streams for robot learning from demonstration. It presents SA-Net, a deep neural network that significantly improves accuracy over a previous method based on traditional image processing and segmentation, evaluated in two diverse robotic applications.
Learning from demonstration (LfD) and imitation learning offer new paradigms for transferring task behavior to robots. A class of methods that enable such learning online requires the robot to observe the task being performed and decompose the sensed streaming data into sequences of state-action pairs, which are then input to the methods. Thus, recognizing the state-action pairs correctly and quickly in sensed data is a crucial prerequisite for these methods. We present SA-Net, a deep neural network architecture that recognizes state-action pairs from RGB-D data streams. SA-Net performed well in two diverse robotic applications of LfD -- one involving mobile ground robots and another involving a robotic manipulator -- which demonstrates that the architecture generalizes well to differing contexts. Comprehensive evaluations, including deployment on a physical robot, show that SA-Net significantly improves on the accuracy of the previous method, which utilizes traditional image processing and segmentation.