CV | Jul 21, 2025

CHROMA: Consistent Harmonization of Multi-View Appearance via Bilateral Grid Prediction

arXiv:2507.15748v3 | 3 citations | h-index: 6
Originality: Incremental advance
AI Analysis

The paper addresses appearance variations that break multi-view consistency in 3D reconstruction. Its generalizable solution is an incremental advance, mainly improving efficiency over existing methods.

The paper tackles photometric inconsistencies across multi-view images that are introduced by on-camera processing and that degrade novel view synthesis. It proposes a feed-forward method that predicts bilateral grids to harmonize appearance efficiently. The method matches or outperforms scene-specific optimization methods without significantly affecting training time.

Modern camera pipelines apply extensive on-device processing, such as exposure adjustment, white balance, and color correction, which, while beneficial individually, often introduce photometric inconsistencies across views. These appearance variations violate multi-view consistency and degrade novel view synthesis. Joint optimization of scene-specific representations and per-image appearance embeddings has been proposed to address this issue, but with increased computational complexity and slower training. In this work, we propose a generalizable, feed-forward approach that predicts spatially adaptive bilateral grids to correct photometric variations in a multi-view consistent manner. Our model processes hundreds of frames in a single step, enabling efficient large-scale harmonization, and seamlessly integrates into downstream 3D reconstruction models, providing cross-scene generalization without requiring scene-specific retraining. To overcome the lack of paired data, we employ a hybrid self-supervised rendering loss leveraging 3D foundation models, improving generalization to real-world variations. Extensive experiments show that our approach outperforms or matches the reconstruction quality of existing scene-specific optimization methods with appearance modeling, without significantly affecting the training time of baseline 3D models.
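To make the idea of a spatially adaptive bilateral grid concrete, the sketch below shows how such a grid is typically applied to an image once it has been predicted. It is a generic illustration of bilateral grid slicing, not the authors' implementation: the function name `slice_bilateral_grid`, the grid layout (one 3x4 affine color matrix per cell), and the use of luminance as the guidance signal are all assumptions chosen for clarity.

```python
import numpy as np

def slice_bilateral_grid(grid, image):
    """Apply a bilateral grid of 3x4 affine color transforms to an RGB image.

    grid:  (Gy, Gx, Gz, 3, 4) array; each cell holds an affine color matrix
           (hypothetical layout, assumed for this sketch).
    image: (H, W, 3) float array with values in [0, 1].
    """
    H, W, _ = image.shape
    Gy, Gx, Gz = grid.shape[:3]

    # Guidance value per pixel: luminance (an assumed, common choice).
    guide = image @ np.array([0.299, 0.587, 0.114])

    # Continuous grid coordinates for every pixel.
    ys = np.linspace(0, Gy - 1, H)[:, None].repeat(W, axis=1)
    xs = np.linspace(0, Gx - 1, W)[None, :].repeat(H, axis=0)
    zs = np.clip(guide, 0.0, 1.0) * (Gz - 1)

    def corners(c, n):
        # Lower and upper neighbor indices plus the fractional offset.
        lo = np.clip(np.floor(c).astype(int), 0, n - 1)
        hi = np.minimum(lo + 1, n - 1)
        return lo, hi, c - lo

    y0, y1, fy = corners(ys, Gy)
    x0, x1, fx = corners(xs, Gx)
    z0, z1, fz = corners(zs, Gz)

    # Trilinear interpolation of the affine matrices over the 8 neighbors.
    affine = np.zeros((H, W, 3, 4))
    for yi, wy in ((y0, 1 - fy), (y1, fy)):
        for xi, wx in ((x0, 1 - fx), (x1, fx)):
            for zi, wz in ((z0, 1 - fz), (z1, fz)):
                w = (wy * wx * wz)[..., None, None]
                affine += w * grid[yi, xi, zi]

    # Apply each pixel's affine transform to its homogeneous color.
    homog = np.concatenate([image, np.ones((H, W, 1))], axis=-1)
    return np.einsum("hwij,hwj->hwi", affine, homog)
```

Because the transform depends on both pixel position and brightness, a coarse grid can express smooth, edge-aware corrections such as exposure or white-balance shifts. A grid whose cells all hold the identity matrix leaves the image unchanged, which is a convenient sanity check.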

