LG | Mar 19

AcceRL: A Distributed Asynchronous Reinforcement Learning and World Model Framework for Vision-Language-Action Models

arXiv:2603.18464 | 97.8 | h-index: 6
AI Analysis

This work addresses efficiency and scalability problems faced by researchers and practitioners who deploy large-scale Vision-Language-Action (VLA) models in complex control tasks. It is a novel integration of existing ideas rather than an incremental improvement.

The paper tackles the computational efficiency and data acquisition challenges of reinforcement learning for large-scale Vision-Language-Action models. It proposes AcceRL, a distributed asynchronous framework that integrates a trainable world model. The authors report state-of-the-art performance on the LIBERO benchmark, super-linear throughput scaling, and improved sample efficiency.

Reinforcement learning (RL) for large-scale Vision-Language-Action (VLA) models faces significant challenges in computational efficiency and data acquisition. We propose AcceRL, a fully asynchronous and decoupled RL framework designed to eliminate synchronization barriers by physically isolating training, inference, and rollouts. Crucially, AcceRL is the first to integrate a plug-and-play, trainable world model into a distributed asynchronous RL pipeline to generate virtual experiences. Experiments on the LIBERO benchmark demonstrate that AcceRL achieves state-of-the-art (SOTA) performance. At the system level, it exhibits super-linear scaling in throughput and highly efficient hardware utilization. At the algorithm level, the world-model-augmented variant delivers unprecedented sample efficiency and robust training stability in complex control tasks.
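
To make the idea of decoupling rollouts from training concrete, here is a minimal Python sketch of an asynchronous pipeline. It is not AcceRL's implementation: threads stand in for the physically isolated processes the abstract describes, the "world model" is a placeholder that copies real transitions, and every name (`run_async_pipeline`, `rollout_worker`, `trainer`, `policy_version`) is hypothetical. The one point it illustrates is that rollout workers never wait for the trainer, so they may act on a stale policy version.

```python
import queue
import threading


def run_async_pipeline(num_rollout_workers=2, steps_per_worker=50, batch_size=10):
    """Toy asynchronous RL loop: workers and trainer share only a queue."""
    experience = queue.Queue()  # rollout -> trainer channel, no barrier
    stats = {"policy_version": 0, "real": 0, "virtual": 0}
    lock = threading.Lock()

    def rollout_worker(worker_id):
        for step in range(steps_per_worker):
            with lock:
                version = stats["policy_version"]  # may already be stale
            # Placeholder transition; a real worker would step an environment.
            experience.put(("real", worker_id, step, version))

    def update(batch):
        # Hypothetical world model: emit one virtual transition per real one.
        virtual = [("virtual",) + t[1:] for t in batch]
        with lock:
            stats["real"] += len(batch)
            stats["virtual"] += len(virtual)
            stats["policy_version"] += 1  # stands in for a gradient step

    def trainer(total_real):
        batch = []
        for _ in range(total_real):
            batch.append(experience.get())  # blocks only when the queue is empty
            if len(batch) == batch_size:
                update(batch)
                batch = []
        if batch:  # flush a final partial batch
            update(batch)

    total = num_rollout_workers * steps_per_worker
    threads = [threading.Thread(target=trainer, args=(total,))]
    threads += [
        threading.Thread(target=rollout_worker, args=(i,))
        for i in range(num_rollout_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats
```

Calling `run_async_pipeline()` returns the counters after all threads finish. In a real system the queue would be replaced by an inter-process or networked buffer, and the placeholder world model by a learned one.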

