CV | AI | Dec 12, 2025

Flowception: Temporally Expansive Flow Matching for Video Generation

arXiv:2512.11438v1 | 2 citations | h-index: 35
Originality: Incremental advance
AI Analysis

This work addresses video generation with a more efficient and flexible method, though it appears incremental because it builds on existing flow matching techniques.

The paper tackles video generation by introducing Flowception, a non-autoregressive framework that interleaves discrete frame insertions with continuous denoising to reduce error accumulation and computational cost, achieving improved FVD and VBench metrics over baselines.

We present Flowception, a novel non-autoregressive and variable-length video generation framework. Flowception learns a probability path that interleaves discrete frame insertions with continuous frame denoising. Compared to autoregressive methods, Flowception alleviates error accumulation/drift as the frame insertion mechanism during sampling serves as an efficient compression mechanism to handle long-term context. Compared to full-sequence flows, our method reduces FLOPs for training three-fold, while also being more amenable to local attention variants, and allowing the length of videos to be learned jointly with their content. Quantitative experimental results show improved FVD and VBench metrics over autoregressive and full-sequence baselines, which is further validated with qualitative results. Finally, by learning to insert and denoise frames in a sequence, Flowception seamlessly integrates different tasks such as image-to-video generation and video interpolation.
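
To make the idea of interleaving discrete frame insertions with continuous denoising more concrete, the toy loop below may help. It is only an illustrative sketch and not the paper's algorithm: the fixed insertion probability `p_insert`, the hand-written `toy_velocity` field (exact for a linear path toward an all-zero target), and every name and default value are assumptions made for this example. In the abstract, both the insertions and the denoising are learned.

```python
import random

NUM_STEPS = 10  # Euler steps per frame (assumed value for this toy)


def toy_velocity(x, t):
    """Hypothetical stand-in for a learned velocity field.

    For the linear path x_t = (1 - t) * noise toward an all-zero target,
    the exact velocity is -x_t / (1 - t).
    """
    return [-v / (1.0 - t) for v in x]


def sample_video(max_len=6, p_insert=0.5, dim=4, seed=0):
    rng = random.Random(seed)

    def new_frame():
        # A fresh frame starts as pure Gaussian noise at step 0.
        return {"x": [rng.gauss(0.0, 1.0) for _ in range(dim)], "k": 0}

    frames = [new_frame()]
    while any(f["k"] < NUM_STEPS for f in frames):
        # Continuous part: one Euler step for every unfinished frame.
        # Each frame carries its own step counter, so frames inserted
        # later are noisier than frames inserted earlier.
        for f in frames:
            if f["k"] < NUM_STEPS:
                t = f["k"] / NUM_STEPS
                v = toy_velocity(f["x"], t)
                f["x"] = [a + b / NUM_STEPS for a, b in zip(f["x"], v)]
                f["k"] += 1
        # Discrete part: a coin flip (assumed, not learned) decides
        # whether to insert a new noise frame at a random position.
        if len(frames) < max_len and rng.random() < p_insert:
            frames.insert(rng.randrange(len(frames) + 1), new_frame())
    return frames


if __name__ == "__main__":
    video = sample_video()
    print(len(video), "frames generated")
```

Because insertions stop on their own once the coin flips fail while all frames are finished, the number of frames returned varies with the seed, which loosely mirrors the variable-length behaviour described above.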

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
