LG, CV, RO | Dec 12, 2022

On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline

arXiv:2212.05749v2 | 78 citations | h-index: 40
AI Analysis

This work addresses the problem of evaluating pre-training methods for visuo-motor control. It shows that a domain gap hinders current approaches and provides a strong baseline for benchmarking. The contribution is incremental, since it revisits and refines existing ideas.

The paper examines the effectiveness of pre-training for visuo-motor control tasks and finds that a simple Learning-from-Scratch (LfS) baseline, combining data augmentation with a shallow ConvNet, is competitive with recent methods that use frozen pre-trained visual representations. This holds across various algorithms, tasks, and metrics, both in simulation and on a real robot.

In this paper, we examine the effectiveness of pre-training for visuo-motor control tasks. We revisit a simple Learning-from-Scratch (LfS) baseline that incorporates data augmentation and a shallow ConvNet, and find that this baseline is surprisingly competitive with recent approaches (PVR, MVP, R3M) that leverage frozen visual representations trained on large-scale vision datasets -- across a variety of algorithms, task domains, and metrics in simulation and on a real robot. Our results demonstrate that these methods are hindered by a significant domain gap between the pre-training datasets and current benchmarks for visuo-motor control, which is alleviated by finetuning. Based on our findings, we provide recommendations for future research in pre-training for control and hope that our simple yet strong baseline will aid in accurately benchmarking progress in this area.
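The abstract names data augmentation as one ingredient of the LfS baseline but does not say which augmentation is used. As a minimal sketch, assuming a random-shift augmentation (pad the frame by edge replication, then crop back to the original size at a random offset), one step could look like the following. The function name `random_shift` and the `pad` parameter are hypothetical and are not taken from the paper or its code.

```python
import numpy as np


def random_shift(image, pad=4, rng=None):
    """Shift an H x W (x C) image by up to `pad` pixels in each direction.

    Assumption: random shift is used here only as an illustrative
    augmentation; the abstract does not specify the augmentation.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    # Pad only the two spatial axes, replicating border pixels.
    pad_width = [(pad, pad), (pad, pad)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad_width, mode="edge")
    # Pick the top-left corner of the crop uniformly in [0, 2 * pad].
    top = int(rng.integers(0, 2 * pad + 1))
    left = int(rng.integers(0, 2 * pad + 1))
    return padded[top:top + h, left:left + w]
```

The output has the same shape and dtype as the input, so the function can sit in front of any encoder without other changes. With `pad=0` it returns the frame unchanged.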

Code Implementations: 1 repo
