CV · Dec 24, 2025

MVInverse: Feed-forward Multi-view Inverse Rendering in Seconds

arXiv:2512.21003v2 · 2 citations · h-index: 4
Originality: Highly original
AI Analysis

The work addresses two problems in multi-view inverse rendering for computer vision and graphics: high computational cost and inconsistent results across views.

The paper introduces a feed-forward framework for multi-view inverse rendering that predicts albedo, metallic, roughness, diffuse shading, and surface normals from RGB image sequences in seconds, and reports state-of-the-art multi-view consistency and generalization to real-world scenes.

Multi-view inverse rendering aims to recover geometry, materials, and illumination consistently across multiple viewpoints. When applied to multi-view images, existing single-view approaches often ignore cross-view relationships, leading to inconsistent results. In contrast, multi-view optimization methods rely on slow differentiable rendering and per-scene refinement, making them computationally expensive and hard to scale. To address these limitations, we introduce a feed-forward multi-view inverse rendering framework that directly predicts spatially varying albedo, metallic, roughness, diffuse shading, and surface normals from sequences of RGB images. By alternating attention across views, our model captures both intra-view long-range lighting interactions and inter-view material consistency, enabling coherent scene-level reasoning within a single forward pass. Due to the scarcity of real-world training data, models trained on existing synthetic datasets often struggle to generalize to real-world scenes. To overcome this limitation, we propose a consistency-based finetuning strategy that leverages unlabeled real-world videos to enhance both multi-view coherence and robustness under in-the-wild conditions. Extensive experiments on benchmark datasets demonstrate that our method achieves state-of-the-art performance in terms of multi-view consistency, material and normal estimation quality, and generalization to real-world imagery. Project page: https://maddog241.github.io/mvinverse-page/
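The alternating-attention idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the single-head attention, the random projection weights, the residual connections, and the names `self_attention`, `alternating_block`, and `make_weights` are all assumptions made for illustration. The sketch only shows the general pattern of attending within each view first, then across all views jointly.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def make_weights(rng, dim):
    # Hypothetical helper: random query/key/value projections.
    return tuple(rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(3))

def self_attention(tokens, w_q, w_k, w_v):
    # Single-head self-attention; tokens has shape (batch, length, dim).
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def alternating_block(tokens, intra_w, inter_w):
    # tokens has shape (views, tokens_per_view, dim).
    n_views, n_tokens, dim = tokens.shape
    # Intra-view step: each view is a separate batch element,
    # so tokens only attend to other tokens in the same image.
    tokens = tokens + self_attention(tokens, *intra_w)
    # Inter-view step: flatten all views into one sequence,
    # so every token can attend to tokens from every view.
    flat = tokens.reshape(1, n_views * n_tokens, dim)
    flat = flat + self_attention(flat, *inter_w)
    return flat.reshape(n_views, n_tokens, dim)

rng = np.random.default_rng(0)
dim = 8
intra_w = make_weights(rng, dim)
inter_w = make_weights(rng, dim)
views = rng.standard_normal((3, 5, dim))  # 3 views, 5 tokens each
out = alternating_block(views, intra_w, inter_w)
```

In this sketch, information from one view reaches another only through the inter-view step, which is the mechanism the abstract credits for cross-view material consistency. A real model would stack several such blocks and add normalization, multiple heads, and feed-forward layers.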

