CV | LG | RO | Jan 23

AnyView: Synthesizing Any Novel View in Dynamic Scenes

Georgia Tech
arXiv:2601.16982v1 | 3 citations | h-index: 22
Originality: Incremental advance
AI Analysis

This work addresses dynamic view synthesis for video generation, offering a generalist approach that could benefit applications in virtual reality and content creation, though its improvement in consistency appears incremental.

The paper tackles the problem of generating novel views in dynamic scenes, where existing methods struggle with consistency, and introduces AnyView, a diffusion-based framework that achieves competitive results on benchmarks and maintains realistic outputs in extreme settings.

Modern generative video models excel at producing convincing, high-quality outputs, but struggle to maintain multi-view and spatiotemporal consistency in highly dynamic real-world environments. In this work, we introduce AnyView, a diffusion-based video generation framework for dynamic view synthesis with minimal inductive biases or geometric assumptions. We leverage multiple data sources with various levels of supervision, including monocular (2D), multi-view static (3D) and multi-view dynamic (4D) datasets, to train a generalist spatiotemporal implicit representation capable of producing zero-shot novel videos from arbitrary camera locations and trajectories. We evaluate AnyView on standard benchmarks, showing competitive results with the current state of the art, and propose AnyViewBench, a challenging new benchmark tailored towards extreme dynamic view synthesis in diverse real-world scenarios. In this more dramatic setting, we find that most baselines drastically degrade in performance, as they require significant overlap between viewpoints, while AnyView maintains the ability to produce realistic, plausible, and spatiotemporally consistent videos when prompted from any viewpoint. Results, data, code, and models can be viewed at: https://tri-ml.github.io/AnyView/
