LG · CV · Jun 18, 2024

Visually Robust Adversarial Imitation Learning from Videos with Contrastive Learning

arXiv:2407.12792v2 · 10 citations · Has Code
AI Analysis

This paper addresses robust imitation learning for robotics in visually mismatched environments and represents an incremental advance.

The paper tackles imitation learning from videos when the agent and expert domains differ visually. It proposes C-LAIfO, which uses contrastive learning to estimate a visually robust latent space and performs adversarial imitation learning within that space. It reports improved performance over baselines on high-dimensional robotic tasks.

We propose C-LAIfO, a computationally efficient algorithm designed for imitation learning from videos in the presence of visual mismatch between agent and expert domains. We analyze the problem of imitation from expert videos with visual discrepancies, and introduce a solution for robust latent space estimation using contrastive learning and data augmentation. Provided a visually robust latent space, our algorithm performs imitation entirely within this space using off-policy adversarial imitation learning. We conduct a thorough ablation study to justify our design and test C-LAIfO on high-dimensional continuous robotic tasks. Additionally, we demonstrate how C-LAIfO can be combined with other reward signals to facilitate learning on a set of challenging hand manipulation tasks with sparse rewards. Our experiments show improved performance compared to baseline methods, highlighting the effectiveness of C-LAIfO. To ensure reproducibility, we open source our code.
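As a rough illustration of the contrastive step described in the abstract, the sketch below computes an InfoNCE-style loss between the latents of two augmented views of the same frames. The choice of InfoNCE, the function names `l2_normalize` and `info_nce`, the temperature value, and the toy latents are all assumptions made for illustration; the abstract does not specify the loss, and this is not the authors' implementation.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch of latent vectors.

    positives[i] is assumed to be the latent of a differently augmented
    view of the same frame as anchors[i]; every other row of `positives`
    acts as a negative for anchors[i].
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over the candidates.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching view as the correct class.
        total += log_sum - logits[i]
    return total / len(a)


# Toy latents (hypothetical): two frames, each encoded under two augmentations.
z_view_a = [[1.0, 0.0], [0.0, 1.0]]
z_view_b = [[0.9, 0.1], [0.1, 0.9]]

aligned = info_nce(z_view_a, z_view_b)         # matching views paired correctly
shuffled = info_nce(z_view_a, z_view_b[::-1])  # pairs deliberately mismatched
print(aligned < shuffled)
```

Minimizing a loss of this kind pulls together the latents of views that differ only in appearance, which is the property the abstract relies on before running adversarial imitation in the latent space.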

Code Implementations: 1 repo
