CV | AI | LG | Dec 5, 2025

InverseCrafter: Efficient Video ReCapture as a Latent Domain Inverse Problem

arXiv:2512.05672v1 | 4 citations | Has Code
Originality: Highly original
AI Analysis

This paper addresses efficient, high-quality controllable video generation for computer vision researchers and practitioners. It offers a method that reduces computational cost while keeping novel view generation quality comparable to fine-tuned approaches.

The paper tackles the computational expense and catastrophic forgetting associated with fine-tuning-based controllable 4D video generation. It proposes InverseCrafter, which reformulates the task as a latent-domain inpainting problem and achieves comparable novel view generation and superior measurement consistency with near-zero computational overhead.

Recent approaches to controllable 4D video generation often rely on fine-tuning pre-trained Video Diffusion Models (VDMs). This dominant paradigm is computationally expensive, requiring large-scale datasets and architectural modifications, and frequently suffers from catastrophic forgetting of the model's original generative priors. Here, we propose InverseCrafter, an efficient inpainting inverse solver that reformulates the 4D generation task as an inpainting problem solved in the latent space. The core of our method is a principled mechanism to encode the pixel space degradation operator into a continuous, multi-channel latent mask, thereby bypassing the costly bottleneck of repeated VAE operations and backpropagation. InverseCrafter not only achieves comparable novel view generation and superior measurement consistency in camera control tasks with near-zero computational overhead, but also excels at general-purpose video inpainting with editing. Code is available at https://github.com/yeobinhong/InverseCrafter.
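To make the idea of a continuous latent mask concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the pixel-space degradation operator is encoded, so average pooling is swapped in here purely as a placeholder, and the same pooled mask is copied to every latent channel. The function names, the downsampling factor of 8, and the 4 latent channels are all assumptions made for illustration.

```python
import numpy as np


def pixel_mask_to_latent_mask(pixel_mask, factor=8, channels=4):
    """Average-pool a binary pixel mask (H, W) into a continuous mask (C, H/f, W/f).

    Placeholder encoding (assumption): each latent cell stores the fraction of
    observed pixels in its factor x factor block, repeated over all channels.
    """
    h, w = pixel_mask.shape
    if h % factor or w % factor:
        raise ValueError("mask size must be divisible by the downsampling factor")
    # Group pixels into factor x factor blocks and average each block.
    blocks = pixel_mask.reshape(h // factor, factor, w // factor, factor)
    pooled = blocks.mean(axis=(1, 3))
    # Repeat the pooled mask over the (hypothetical) latent channels.
    return np.broadcast_to(pooled, (channels, h // factor, w // factor)).copy()


def blend_latents(generated, measured, latent_mask):
    """Soft inpainting blend: measured latents where mask is 1, generated where 0."""
    return latent_mask * measured + (1.0 - latent_mask) * generated


# Toy usage: one fully observed block and one half-observed block.
mask = np.zeros((16, 16))
mask[:8, :8] = 1.0
mask[8:, 8:12] = 1.0
latent_mask = pixel_mask_to_latent_mask(mask)
blended = blend_latents(np.zeros((4, 2, 2)), np.ones((4, 2, 2)), latent_mask)
```

Because the blend operates directly on latents, a step like this needs no VAE encode or decode and no backpropagation, which is the kind of saving the abstract attributes to working in the latent space.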

