Flow-Factory: A Unified Framework for Reinforcement Learning in Flow-Matching Models
This work addresses a practical engineering bottleneck for researchers and practitioners in reinforcement learning and generative modeling. It offers a modular framework that reduces implementation overhead and accelerates prototyping, though the contribution is incremental in nature.
The paper tackles the fragmented codebases and engineering complexity that arise when reinforcement learning is used to align flow-matching models with human preferences. It introduces Flow-Factory, a unified framework that enables seamless integration of new algorithms and architectures, demonstrated through support for GRPO, DiffusionNFT, and AWM across Flux, Qwen-Image, and WAN video models.
Reinforcement learning has emerged as a promising paradigm for aligning diffusion and flow-matching models with human preferences, yet practitioners face fragmented codebases, model-specific implementations, and engineering complexity. We introduce Flow-Factory, a unified framework that decouples algorithms, models, and rewards through a modular, registry-based architecture. This design enables seamless integration of new algorithms and architectures, as demonstrated by our support for GRPO, DiffusionNFT, and AWM across Flux, Qwen-Image, and WAN video models. By minimizing implementation overhead, Flow-Factory empowers researchers to rapidly prototype and scale future innovations. Flow-Factory also provides production-ready memory optimization, flexible multi-reward training, and seamless distributed training support. The codebase is available at https://github.com/X-GenGroup/Flow-Factory.
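To make the registry-based decoupling concrete, the following is a minimal sketch of how such a design might look in Python. It is not Flow-Factory's actual API: the `Registry` class, the `REWARDS` and `ALGORITHMS` instances, the toy reward classes, and `build_from_config` are all hypothetical names chosen for illustration, and the "algorithm" here is only a stand-in that combines several rewards by weight, loosely mirroring the idea of multi-reward training.

```python
from typing import Callable, Dict


class Registry:
    """Hypothetical name-to-class registry (illustrative, not Flow-Factory's API)."""

    def __init__(self, kind: str) -> None:
        self.kind = kind
        self._entries: Dict[str, Callable] = {}

    def register(self, name: str) -> Callable:
        # Used as a decorator: stores the class under `name` and returns it unchanged.
        def decorator(obj: Callable) -> Callable:
            if name in self._entries:
                raise KeyError(f"{self.kind} '{name}' is already registered")
            self._entries[name] = obj
            return obj

        return decorator

    def build(self, name: str, **kwargs):
        # Look up the class by name and instantiate it with the given arguments.
        if name not in self._entries:
            raise KeyError(f"unknown {self.kind} '{name}'")
        return self._entries[name](**kwargs)


# Separate registries keep rewards and algorithms independent of each other.
REWARDS = Registry("reward")
ALGORITHMS = Registry("algorithm")


@REWARDS.register("length")
class LengthReward:
    """Toy reward: the length of the sample string."""

    def __call__(self, sample: str) -> float:
        return float(len(sample))


@REWARDS.register("vowel_count")
class VowelCountReward:
    """Toy reward: the number of lowercase vowels in the sample."""

    def __call__(self, sample: str) -> float:
        return float(sum(ch in "aeiou" for ch in sample))


@ALGORITHMS.register("weighted_sum")
class WeightedSumTrainer:
    """Toy stand-in for an algorithm: scores a sample by a weighted sum of rewards."""

    def __init__(self, rewards: Dict[str, float]) -> None:
        # Rewards are resolved by name, so the algorithm never imports them directly.
        self.rewards = [(REWARDS.build(name), w) for name, w in rewards.items()]

    def score(self, sample: str) -> float:
        return sum(weight * fn(sample) for fn, weight in self.rewards)


def build_from_config(config: dict):
    """Assemble an algorithm and its rewards purely from string names in a config."""
    return ALGORITHMS.build(config["algorithm"], rewards=config["rewards"])


trainer = build_from_config(
    {"algorithm": "weighted_sum", "rewards": {"length": 0.5, "vowel_count": 2.0}}
)
print(trainer.score("flow"))
```

Under this sketch, adding a new reward or algorithm means registering one more class; the config-driven assembly code stays untouched, which is the kind of decoupling the abstract describes.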