CV · AI · Nov 6, 2023

LDM3D-VR: Latent Diffusion Model for 3D VR

arXiv:2311.03226v1 · 12 citations · h-index: 1
Originality: Synthesis-oriented
AI Analysis

The paper addresses the limited availability of RGBD content generation for VR developers, though it appears incremental because it fine-tunes existing models on new data.

The authors address the joint generation of RGB images and depth maps for virtual reality development. They introduce LDM3D-VR, a suite of diffusion models that generates panoramic RGBD from text prompts and upscales low-resolution inputs to high-resolution RGBD. Their evaluation reports competitive performance against existing methods.

Latent diffusion models have proven to be state-of-the-art in the creation and manipulation of visual outputs. However, as far as we know, the generation of depth maps jointly with RGB is still limited. We introduce LDM3D-VR, a suite of diffusion models targeting virtual reality development that includes LDM3D-pano and LDM3D-SR. These models enable the generation of panoramic RGBD based on textual prompts and the upscaling of low-resolution inputs to high-resolution RGBD, respectively. Our models are fine-tuned from existing pretrained models on datasets containing panoramic/high-resolution RGB images, depth maps and captions. Both models are evaluated in comparison to existing related methods.

