CV AI Nov 6, 2023

LDM3D-VR: Latent Diffusion Model for 3D VR

Gabriela Ben Melech Stan, Diana Wofk, Estelle Aflalo, Shao-Yen Tseng, Zhipeng Cai, Michael Paulitsch, Vasudev Lal

arXiv:2311.03226v18.412 citationsh-index: 1

Originality Synthesis-oriented

AI Analysis

This work addresses the limited availability of RGBD generation tools for VR developers, though it appears incremental because it fine-tunes existing pretrained models on new data.

The authors tackle the problem of generating depth maps jointly with RGB for virtual reality development by introducing LDM3D-VR, a suite of two diffusion models. LDM3D-pano creates panoramic RGBD from text prompts, and LDM3D-SR upscales low-resolution inputs to high-resolution RGBD. The evaluation reports competitive performance against existing methods.

Latent diffusion models have proven to be state-of-the-art in the creation and manipulation of visual outputs. However, as far as we know, the generation of depth maps jointly with RGB is still limited. We introduce LDM3D-VR, a suite of diffusion models targeting virtual reality development that includes LDM3D-pano and LDM3D-SR. These models enable the generation of panoramic RGBD based on textual prompts and the upscaling of low-resolution inputs to high-resolution RGBD, respectively. Our models are fine-tuned from existing pretrained models on datasets containing panoramic/high-resolution RGB images, depth maps and captions. Both models are evaluated in comparison to existing related methods.
