RO · AI · CV · Jul 30, 2024

VITAL: Interactive Few-Shot Imitation Learning via Visual Human-in-the-Loop Corrections

arXiv:2407.21244v2 · h-index: 16
Originality: Incremental advance
AI Analysis

The work addresses data efficiency and adaptability in real-world robotic manipulation. It is an incremental advance because it builds on existing imitation learning and data augmentation methods.

The paper tackles the high cost of data collection for imitation learning in robotics. It generates large-scale training data from a few demonstrations using simulation and human-in-the-loop corrections. The resulting robust policies are validated on tasks such as bottle collecting and hammering, and the framework generalizes to new tasks such as setting a drink tray.

Imitation Learning (IL) has emerged as a powerful approach in robotics, allowing robots to acquire new skills by mimicking human actions. Despite its potential, the data collection process for IL remains a significant challenge due to the logistical difficulties and high costs associated with obtaining high-quality demonstrations. To address these issues, we propose generating large-scale data from a handful of demonstrations through data augmentation in simulation. Our approach leverages affordable hardware and visual processing techniques to collect demonstrations, which are then augmented to create extensive training datasets for imitation learning. By utilizing both real and simulated environments, along with human-in-the-loop corrections, we enhance the generalizability and robustness of the learned policies. We evaluated our method through several rounds of experiments in both simulated and real-robot settings, focusing on tasks of varying complexity, including bottle collecting, stacking objects, and hammering. Our experimental results validate the effectiveness of our approach in learning robust robot policies from simulated data, which are significantly improved by human-in-the-loop corrections and real-world data integration. Additionally, we demonstrate the framework's capability to generalize to new tasks, such as setting a drink tray, showcasing its adaptability and potential for handling a wide range of real-world manipulation tasks. A video of the experiments can be found at: https://youtu.be/YeVAMRqRe64?si=R179xDlEGc7nPu8i
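
As a rough illustration of how a handful of demonstrations might be expanded into a larger training set, the sketch below applies one random rigid transform (a rotation plus a translation) to every waypoint of a demonstrated trajectory. This is not the paper's pipeline: the function name `augment_demo`, the 2-D waypoint representation, and the perturbation ranges are all assumptions made for illustration only.

```python
import math
import random


def augment_demo(waypoints, n_copies, max_shift=0.05, max_yaw=math.pi / 6, seed=0):
    """Return n_copies perturbed copies of a 2-D demonstration trajectory.

    Hypothetical helper: each copy applies a single random rigid transform
    (rotation about the first waypoint, then a translation) to all waypoints,
    imitating the same task performed with the scene in a different pose.
    """
    rng = random.Random(seed)  # seeded for reproducible augmentation
    x0, y0 = waypoints[0]
    copies = []
    for _ in range(n_copies):
        yaw = rng.uniform(-max_yaw, max_yaw)
        dx = rng.uniform(-max_shift, max_shift)
        dy = rng.uniform(-max_shift, max_shift)
        c, s = math.cos(yaw), math.sin(yaw)
        new_traj = []
        for x, y in waypoints:
            rx, ry = x - x0, y - y0  # coordinates relative to the pivot
            new_traj.append((x0 + c * rx - s * ry + dx,
                             y0 + s * rx + c * ry + dy))
        copies.append(new_traj)
    return copies


# One assumed demonstration (metres, planar) expanded into 100 variants.
demo = [(0.0, 0.0), (0.1, 0.0), (0.1, 0.2)]
dataset = augment_demo(demo, n_copies=100)
```

Because the transform is rigid, each variant keeps the shape of the original motion while changing where and in which direction it happens. A real system would also need to re-render observations and check feasibility in simulation, which this sketch leaves out.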
