RO · LG · Oct 19, 2024

MENTOR: Mixture-of-Experts Network with Task-Oriented Perturbation for Visual Reinforcement Learning

Tsinghua
arXiv:2410.14972v3 · 31 citations · h-index: 6 · ICML
Originality: Highly original
AI Analysis

This paper addresses sample efficiency for real-world robotic manipulation. It represents a strong, specific gain rather than a foundational breakthrough.

The paper tackles low sample efficiency in visual deep reinforcement learning for robotics by introducing MENTOR, which combines a mixture-of-experts backbone with task-oriented perturbation. It achieves an average 83% success rate on three real-world manipulation tasks, compared with 32% for the strongest existing model-free visual RL algorithm.

Visual deep reinforcement learning (RL) enables robots to acquire skills from visual input for unstructured tasks. However, current algorithms suffer from low sample efficiency, limiting their practical applicability. In this work, we present MENTOR, a method that improves both the architecture and optimization of RL agents. Specifically, MENTOR replaces the standard multi-layer perceptron (MLP) with a mixture-of-experts (MoE) backbone and introduces a task-oriented perturbation mechanism. MENTOR outperforms state-of-the-art methods across three simulation benchmarks and achieves an average of 83% success rate on three challenging real-world robotic manipulation tasks, significantly surpassing the 32% success rate of the strongest existing model-free visual RL algorithm. These results underscore the importance of sample efficiency in advancing visual RL for real-world robotics. Experimental videos are available at https://suninghuang19.github.io/mentor_page/.
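To make the architectural idea concrete, the following is a minimal sketch of a generic sparsely gated mixture-of-experts layer standing in for a single MLP block. It is not the authors' implementation: the class name `MoELayer`, the layer sizes, the number of experts, the top-k routing rule, and the random initialization are all assumptions chosen for illustration, and the abstract does not specify any of them. The sketch uses only the Python standard library.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def make_linear(rng, n_in, n_out):
    # Hypothetical helper: small random weights, zero bias.
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(weights, bias, x):
    # Dense layer: one output per weight row.
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


class MoELayer:
    """Illustrative MoE block (assumed design): a router picks top_k expert MLPs."""

    def __init__(self, n_in, n_hidden, n_out, num_experts=4, top_k=2, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        self.n_out = n_out
        self.router = make_linear(rng, n_in, num_experts)
        # Each expert is a two-layer MLP: (first layer, second layer).
        self.experts = [
            (make_linear(rng, n_in, n_hidden), make_linear(rng, n_hidden, n_out))
            for _ in range(num_experts)
        ]

    def forward(self, x):
        logits = linear(*self.router, x)
        # Keep the indices of the top_k highest-scoring experts.
        chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        chosen = chosen[: self.top_k]
        # Renormalize the gate weights over the chosen experts only.
        gates = softmax([logits[i] for i in chosen])
        out = [0.0] * self.n_out
        for gate, idx in zip(gates, chosen):
            (w1, b1), (w2, b2) = self.experts[idx]
            hidden = [max(0.0, h) for h in linear(w1, b1, x)]  # ReLU
            y = linear(w2, b2, hidden)
            out = [o + gate * yi for o, yi in zip(out, y)]
        return out, dict(zip(chosen, gates))


# Usage: a 3-dimensional feature vector mapped to a 2-dimensional output.
layer = MoELayer(n_in=3, n_hidden=8, n_out=2)
action, gates = layer.forward([0.1, -0.2, 0.3])
```

Here `gates` maps each selected expert index to its mixing weight, so only `top_k` experts contribute to any one input. In an actual agent the input would be an encoded image feature and the weights would be trained rather than randomly drawn.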

