LG · Mar 29

FlowRL: A Taxonomy and Modular Framework for Reinforcement Learning with Diffusion Policies

Chenxiao Gao, Edward Chen, Tianyi Chen, Bo Dai

arXiv:2603.27450 · 89.5 · h-index: 6 · Has Code

Predicted impact: top 8% in LG · last 90 days · Originality: Synthesis-oriented

AI Analysis

The work offers a unified perspective and a practical toolkit for researchers and practitioners working on RL with diffusion policies, addressing the field's lack of standardization.

The paper introduces a taxonomy for reinforcement learning with diffusion/flow policies and provides a modular JAX-based framework for high-throughput training, along with standardized benchmarks across multiple environments to guide algorithm selection.

Thanks to their remarkable flexibility, diffusion models and flow models have emerged as promising candidates for policy representation. However, efficient reinforcement learning (RL) upon these policies remains a challenge due to the lack of explicit log-probabilities for vanilla policy gradient estimators. While numerous attempts have been proposed to address this, the field lacks a unified perspective to reconcile these seemingly disparate methods, thus hampering ongoing development. In this paper, we bridge this gap by introducing a comprehensive taxonomy for RL algorithms with diffusion/flow policies. To support reproducibility and agile prototyping, we introduce a modular, JAX-based open-source codebase that leverages JIT-compilation for high-throughput training. Finally, we provide systematic and standardized benchmarks across Gym-Locomotion, DeepMind Control Suite, and IsaacLab, offering a rigorous side-by-side comparison of diffusion-based methods and guidance for practitioners to choose proper algorithms based on the application. Our work establishes a clear foundation for understanding and algorithm design, a high-efficiency toolkit for future research in the field, and an algorithmic guideline for practitioners in generative models and robotics. Our code is available at https://github.com/typoverflow/flow-rl.
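To make the abstract's two technical points concrete, here is a minimal JAX sketch, not taken from the FlowRL codebase. It assumes a toy flow policy whose action is produced by Euler-integrating a learned velocity field from Gaussian noise. The names `velocity`, `init_params`, and `sample_action` are hypothetical, as are the single-layer network and the step count. The sampler returns an action but no log-probability, which is the difficulty the abstract describes for vanilla policy gradient estimators, and the whole sampling loop is JIT-compiled.

```python
import jax
import jax.numpy as jnp


def init_params(key, obs_dim, act_dim):
    # Hypothetical toy parameters: one linear layer over [obs, action, t].
    W = 0.1 * jax.random.normal(key, (act_dim, obs_dim + act_dim + 1))
    return {"W": W, "b": jnp.zeros(act_dim)}


def velocity(params, obs, a, t):
    # Toy velocity field v(obs, a, t); a real policy would use a deeper network.
    x = jnp.concatenate([obs, a, jnp.atleast_1d(t)])
    return jnp.tanh(params["W"] @ x + params["b"])


def sample_action(params, obs, key, num_steps=10):
    # Start from Gaussian noise and integrate da/dt = v(obs, a, t) on [0, 1]
    # with fixed-step Euler updates. Only the sample is returned: the density
    # of the resulting action is never computed along the way.
    act_dim = params["b"].shape[0]
    a0 = jax.random.normal(key, (act_dim,))
    dt = 1.0 / num_steps

    def step(a, i):
        t = i * dt
        return a + dt * velocity(params, obs, a, t), None

    a1, _ = jax.lax.scan(step, a0, jnp.arange(num_steps))
    return a1


# num_steps fixes the loop length, so it is marked static for compilation.
sample_action_jit = jax.jit(sample_action, static_argnames="num_steps")

k_init, k_sample = jax.random.split(jax.random.PRNGKey(0))
params = init_params(k_init, obs_dim=3, act_dim=2)
obs = jnp.ones(3)
action = sample_action_jit(params, obs, k_sample, num_steps=10)
```

Because the action is defined only implicitly through the integration, a score-function gradient of the form ∇ log π(a|s) has no closed-form term to differentiate here. That gap is what the methods organized by the taxonomy work around in different ways.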

