RO AI LG | Jan 18, 2024

Imitation Learning Inputting Image Feature to Each Layer of Neural Network

Koki Yamane, Sho Sakaino, Toshiaki Tsuji

arXiv:2401.09691v22.24 citations | AMC

Originality: Incremental advance

AI Analysis

The paper addresses a problem in robotic imitation learning with multimodal data, but the advance is incremental because it builds on existing end-to-end approaches.

The paper tackles the challenge of imitation learning with multimodal data, where low-correlation inputs like images are often ignored, especially with short sampling periods. By inputting image features into each neural network layer, the method significantly improves success rates in pick-and-place experiments.

Imitation learning enables robots to learn and replicate human behavior from training data. Recent advances in machine learning enable end-to-end learning approaches that directly process high-dimensional observation data, such as images. However, these approaches face a critical challenge when processing data from multiple modalities, inadvertently ignoring data with a lower correlation to the desired output, especially when using short sampling periods. This paper presents a useful method to address this challenge, which amplifies the influence of data with a relatively low correlation to the output by inputting the data into each neural network layer. The proposed approach effectively incorporates diverse data sources into the learning process. Through experiments using a simple pick-and-place operation with raw images and joint information as input, significant improvements in success rates are demonstrated even when dealing with data from short sampling periods.
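The core idea in the abstract, feeding the image feature into every layer rather than only the first, can be illustrated with a small sketch. This is not the authors' architecture or code: it is a plain feed-forward network in pure Python, and all names (`build_policy`, `forward`), dimensions, and the tanh activation are assumptions chosen for illustration. It assumes the image has already been reduced to a fixed-length feature vector by some encoder, which is not shown here.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x):
    """Apply a dense layer followed by tanh (activation is an assumption)."""
    weights, biases = layer
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def build_policy(joint_dim, image_dim, hidden_dim, out_dim, n_hidden, rng):
    """Build layers whose input width always includes the image feature."""
    layers = []
    n_in = joint_dim + image_dim
    for _ in range(n_hidden):
        layers.append(make_layer(n_in, hidden_dim, rng))
        # Every later layer sees the previous hidden state AND the image feature.
        n_in = hidden_dim + image_dim
    layers.append(make_layer(n_in, out_dim, rng))
    return layers


def forward(layers, joint_state, image_feature):
    """Re-concatenate the image feature onto the input of each layer."""
    h = list(joint_state)
    for layer in layers:
        h = dense(layer, h + list(image_feature))
    return h


rng = random.Random(0)
policy = build_policy(joint_dim=3, image_dim=4, hidden_dim=8,
                      out_dim=3, n_hidden=2, rng=rng)
joints = [0.1, -0.2, 0.3]
action_a = forward(policy, joints, [0.5, 0.5, 0.5, 0.5])
action_b = forward(policy, joints, [-0.5, -0.5, -0.5, -0.5])
```

Because the image feature is concatenated at every depth, it has a direct path to each layer's weights instead of having to survive through the earlier layers, which is the mechanism the abstract credits for keeping low-correlation inputs from being ignored.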
