DNF-Intrinsic: Deterministic Noise-Free Diffusion for Indoor Inverse Rendering
This work offers a more robust and efficient approach to inverse rendering of indoor scenes, though the contribution is incremental because it builds on existing diffusion model frameworks.
The paper addresses a limitation of noise-to-intrinsic diffusion models for indoor inverse rendering: they predict intrinsic properties from noisy images in which structure and appearance information is degraded. The proposed DNF-Intrinsic method instead takes the source image as input and uses flow matching to predict deterministic intrinsic properties. The authors report that it clearly outperforms state-of-the-art methods on both synthetic and real-world datasets.
Recent methods have shown that pre-trained diffusion models can be fine-tuned to enable generative inverse rendering by learning an image-conditioned noise-to-intrinsic mapping. Despite their remarkable progress, they struggle to robustly produce high-quality results, because the noise-to-intrinsic paradigm essentially uses noisy images with deteriorated structure and appearance for intrinsic prediction, while it is common knowledge that the structure and appearance information in an image is crucial for inverse rendering. To address this issue, we present DNF-Intrinsic, a robust yet efficient inverse rendering approach fine-tuned from a pre-trained diffusion model, in which we take the source image rather than Gaussian noise as input and directly predict deterministic intrinsic properties via flow matching. Moreover, we design a generative renderer to constrain the predicted intrinsic properties to be physically faithful to the source image. Experiments on both synthetic and real-world datasets show that our method clearly outperforms existing state-of-the-art methods.
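The idea of starting the flow from the source image instead of Gaussian noise can be illustrated with a minimal sketch. It assumes a straight-line (rectified-flow style) path between the source and the intrinsic map and plain Euler integration; the abstract does not specify the path, the network, the latent space, or the number of steps, so the function names, the array shapes, and the `oracle` predictor below are hypothetical stand-ins and not the authors' implementation.

```python
import numpy as np


def interpolate(source, target, t):
    """Point on the assumed straight path: source at t=0, target at t=1."""
    return (1.0 - t) * source + t * target


def target_velocity(source, target):
    """Velocity of the straight path; it is constant in t."""
    return target - source


def flow_matching_loss(predict_velocity, source, target, t):
    """Mean squared error between predicted and target velocity at time t."""
    x_t = interpolate(source, target, t)
    v_pred = predict_velocity(x_t, t)
    return float(np.mean((v_pred - target_velocity(source, target)) ** 2))


def integrate(predict_velocity, source, num_steps=4):
    """Deterministic Euler integration from the source; no noise is sampled."""
    x = source.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        x = x + dt * predict_velocity(x, i * dt)
    return x


rng = np.random.default_rng(0)
source = rng.random((8, 8, 3))  # stand-in for the source image (or its latent)
target = rng.random((8, 8, 3))  # stand-in for an intrinsic map, e.g. albedo


def oracle(x_t, t):
    """Hypothetical perfect predictor, used only to check the mechanics.

    A real system would use a fine-tuned network that never sees `target`.
    """
    return target - source


loss = flow_matching_loss(oracle, source, target, t=0.3)
prediction = integrate(oracle, source, num_steps=4)
```

With the oracle predictor the loss is zero and the integration lands on the target, which only confirms that the path and the integrator are consistent with each other. Because the starting point is the image itself, repeated runs give the same output, in contrast to sampling a fresh noise vector each time.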