RO · AI · LG · SY | Sep 16, 2025

Pre-trained Visual Representations Generalize Where it Matters in Model-Based Reinforcement Learning

arXiv:2509.12531v1 | h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses robustness in visual policy learning for robotic applications, though it is incremental as it builds on prior findings about PVMs in model-free reinforcement learning.

The paper tackles poor generalization to novel visual scene changes in model-based reinforcement learning (MBRL) by investigating the use of pre-trained vision models (PVMs). It shows that PVMs perform much better than a baseline trained from scratch under severe visual domain shifts, and that partial fine-tuning achieves the highest average task performance under the most extreme shifts.

In visuomotor policy learning, the control policy for the robotic agent is derived directly from visual inputs. The typical approach, where a policy and vision encoder are trained jointly from scratch, generalizes poorly to novel visual scene changes. Using pre-trained vision models (PVMs) to inform a policy network improves robustness in model-free reinforcement learning (MFRL). Recent developments in model-based reinforcement learning (MBRL) suggest that MBRL is more sample-efficient than MFRL. However, counterintuitively, existing work has found PVMs to be ineffective in MBRL. Here, we investigate the effectiveness of PVMs in MBRL, specifically on generalization under visual domain shifts. We show that, in scenarios with severe shifts, PVMs perform much better than a baseline model trained from scratch. We further investigate the effects of varying levels of fine-tuning of PVMs. Our results show that partial fine-tuning can maintain the highest average task performance under the most extreme distribution shifts. Our results demonstrate that PVMs are highly successful in promoting robustness in visual policy learning, providing compelling evidence for their wider adoption in model-based robotic learning applications.
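As an illustration of what "varying levels of fine-tuning" can mean in practice, the sketch below marks which blocks of a pre-trained encoder receive gradient updates. It is not the paper's implementation: the abstract does not say which layers are tuned, so the choice to unfreeze only the last blocks, along with the names `Block` and `set_finetune_level`, are assumptions made here for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """Hypothetical stand-in for one block of a pre-trained vision encoder."""
    name: str
    trainable: bool = True


def set_finetune_level(blocks: List[Block], n_trainable_last: int) -> List[Block]:
    """Freeze all blocks except the last `n_trainable_last` (an assumed scheme).

    n_trainable_last == 0            -> fully frozen encoder
    0 < n_trainable_last < len(...)  -> partial fine-tuning
    n_trainable_last == len(blocks)  -> full fine-tuning
    """
    if not 0 <= n_trainable_last <= len(blocks):
        raise ValueError("n_trainable_last must be between 0 and len(blocks)")
    cutoff = len(blocks) - n_trainable_last
    for index, block in enumerate(blocks):
        # Early blocks keep their pre-trained weights; later ones adapt to the task.
        block.trainable = index >= cutoff
    return blocks


# Example: a four-block encoder with only the last two blocks trainable.
encoder = [Block(f"block{i}") for i in range(4)]
set_finetune_level(encoder, 2)
```

In a deep-learning framework, the `trainable` flag would correspond to excluding the frozen blocks' parameters from the optimizer, so that only the unfrozen part of the encoder adapts alongside the world model and policy.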

