RO · May 26

Can VLA Models Learn from Real-World Data Continually without Forgetting?

Jiarun Zhu, Yijun Hong, Xiaoquan Sun, Zetian Xu, Mingqi Yuan, Zhiyong Wang, Wenjun Zeng, Jiayu Chen

arXiv:2605.26820 · 68.8

Predicted impact: top 26% in RO · last 90 days

Originality: Incremental advance

AI Analysis

For robotics researchers aiming to deploy VLA models that continuously acquire new skills without forgetting, this work offers the first real-world empirical insights and practical guidance.

This paper studies continual learning for vision-language-action (VLA) models in real-world robotics, finding that they suffer significant catastrophic forgetting when learning heterogeneous tasks sequentially. It provides the first empirical study on this topic and identifies key factors for successful experience replay.

Vision-language-action (VLA) models provide a promising foundation for general-purpose robotics. However, their successful deployment in real-world scenarios requires the ability to continually acquire new skills while retaining previously learned behaviors. While pioneering research has studied the continual learning of VLA models in narrowly simulated environments, this challenge remains largely unexplored under realistic conditions. To address this limitation, we construct a real-world continual learning dataset comprising four sequential manipulation tasks, spanning rigid-object pick-and-place, contact-rich pressing, and deformable-object folding. Using this dataset, we conduct comprehensive experiments and find that VLA models suffer significant catastrophic forgetting when continually learning from heterogeneous real-world demonstrations. We then systematically evaluate experience replay and uncover key implementation factors that govern its success. In summary, this work provides the first empirical study of real-world continual VLA learning and offers practical guidance for deploying long-lived robot policies.
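The abstract does not describe how experience replay was implemented, so the following is only a minimal sketch of the general technique, not the authors' method. It assumes demonstrations are stored as plain Python items, that a fixed number of samples is kept per finished task, and that each training batch reserves a fixed fraction for replayed samples. The names `update_buffer`, `build_replay_batch`, `per_task_quota`, and `replay_ratio` are hypothetical and chosen for illustration.

```python
import random


def update_buffer(replay_buffer, finished_task, per_task_quota, rng):
    """Keep a random subset of a finished task's demonstrations for later replay.

    per_task_quota is a hypothetical knob: how many samples to retain per task.
    """
    keep = rng.sample(finished_task, min(per_task_quota, len(finished_task)))
    return replay_buffer + keep


def build_replay_batch(current_task, replay_buffer, batch_size, replay_ratio, rng):
    """Mix new-task demonstrations with stored samples from earlier tasks.

    replay_ratio is a hypothetical knob: the fraction of each batch drawn
    from the replay buffer (capped by how many samples the buffer holds).
    """
    n_replay = min(int(round(batch_size * replay_ratio)), len(replay_buffer))
    n_new = min(batch_size - n_replay, len(current_task))
    new_part = rng.sample(current_task, n_new)
    old_part = rng.sample(replay_buffer, n_replay)
    return new_part + old_part


# Usage with placeholder data: (task_name, demo_index) pairs stand in for demonstrations.
rng = random.Random(0)
pick_demos = [("pick", i) for i in range(10)]
press_demos = [("press", i) for i in range(10)]

buffer = update_buffer([], pick_demos, per_task_quota=4, rng=rng)
batch = build_replay_batch(press_demos, buffer, batch_size=8, replay_ratio=0.25, rng=rng)
```

In this sketch, a quarter of each batch comes from the earlier task, so the policy keeps seeing old demonstrations while training on the new one. The buffer size and mixing ratio are the kind of implementation choices that could plausibly affect how well replay works, though the abstract does not say which factors the paper actually identifies.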
