CV · LG · Jul 29, 2023

RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects

Sascha Kirch, Valeria Olyunina, Jan Ondřej, Rafael Pagés, Sergio Martin, Clara Pérez-Molina

arXiv:2307.15988v12.83 citations · h-index: 28 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses depth estimation for humanoid subjects. The advance is incremental: it builds on existing diffusion models and adds specific enhancements.

The paper tackles the generation of high-resolution depth maps from low-resolution RGB images of humanoid subjects. It does so with a multi-modal conditional diffusion model and a novel depth noise augmentation technique.

We present RGB-D-Fusion, a multi-modal conditional denoising diffusion probabilistic model to generate high-resolution depth maps from low-resolution monocular RGB images of humanoid subjects. RGB-D-Fusion first generates a low-resolution depth map using an image-conditioned denoising diffusion probabilistic model and then upsamples the depth map using a second denoising diffusion probabilistic model conditioned on a low-resolution RGB-D image. We further introduce a novel augmentation technique, depth noise augmentation, to increase the robustness of our super-resolution model.
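
The two-stage data flow described in the abstract can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the function names (`depth_noise_augmentation`, `make_rgbd_condition`, `rgbd_fusion_sketch`) are hypothetical, the two diffusion models are stood in for by arbitrary callables, and the abstract does not say how depth noise augmentation is defined, so additive zero-mean Gaussian noise on the conditioning depth map is assumed here.

```python
import random


def depth_noise_augmentation(depth_lr, sigma, rng):
    """Perturb a low-resolution depth map (list of rows of floats).

    ASSUMPTION: additive zero-mean Gaussian noise per pixel. The abstract
    names the technique but does not give its exact form.
    """
    return [[d + rng.gauss(0.0, sigma) for d in row] for row in depth_lr]


def make_rgbd_condition(rgb_lr, depth_lr):
    """Stack each (r, g, b) pixel with its depth into a 4-channel RGB-D pixel."""
    return [
        [(*rgb, d) for rgb, d in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb_lr, depth_lr)
    ]


def rgbd_fusion_sketch(rgb_lr, depth_model, sr_model):
    """Two-stage flow with placeholder models (hypothetical callables).

    Stage 1: an image-conditioned model maps low-res RGB to low-res depth.
    Stage 2: a second model, conditioned on the low-res RGB-D image,
    produces the upsampled depth map.
    """
    depth_lr = depth_model(rgb_lr)
    condition = make_rgbd_condition(rgb_lr, depth_lr)
    return sr_model(condition)
```

Under this assumption, the augmentation would be applied to the depth channel of the conditioning image while training the second model, so that it sees imperfect low-resolution depth similar to what the first model produces at inference time.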
