CV | LG | Jan 23

Reward-Forcing: Autoregressive Video Generation with Reward Feedback

arXiv:2601.16933v1 | 1 citation | h-index: 6
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of efficient, scalable video generation for applications that require near real-time output, though it appears incremental relative to existing autoregressive approaches.

The paper tackles the problem of autoregressive video generation by using reward signals to guide the process, achieving comparable performance to state-of-the-art methods with a VBench score of 84.92.

While most prior work in video generation relies on bidirectional architectures, recent efforts have sought to adapt these models into autoregressive variants to support near real-time generation. However, such adaptations often depend heavily on teacher models, which can limit performance, particularly in the absence of a strong autoregressive teacher, so that output quality typically lags behind that of the bidirectional counterparts. In this paper, we explore an alternative approach that uses reward signals to guide the generation process, enabling more efficient and scalable autoregressive generation. This reward guidance simplifies training while preserving high visual fidelity and temporal consistency. Through extensive experiments on standard benchmarks, we find that our approach performs comparably to existing autoregressive models and, in some cases, surpasses similarly sized bidirectional models by avoiding constraints imposed by teacher architectures. For example, on VBench, our method achieves a total score of 84.92, closely matching state-of-the-art autoregressive methods that score 84.31 but require significant heterogeneous distillation.

