Skel3D: Skeleton Guided Novel View Synthesis
This work addresses the generation of consistent novel views from a single image for 3D object synthesis. It is an incremental improvement that integrates skeleton guidance into existing diffusion-based approaches.
The paper tackles monocular open-set novel view synthesis by using object skeletons to guide a diffusion model. This improves pose accuracy and multi-view consistency across diverse object categories in the Objaverse dataset, and the method outperforms state-of-the-art methods both quantitatively and qualitatively.
In this paper, we present an approach for monocular open-set novel view synthesis (NVS) that leverages object skeletons to guide the underlying diffusion model. Building upon a baseline that utilizes a pre-trained 2D image generator, our method takes advantage of the Objaverse dataset, which includes animated objects with bone structures. By introducing a skeleton guide layer following the existing ray conditioning normalization (RCN) layer, our approach enhances pose accuracy and multi-view consistency. The skeleton guide layer provides detailed structural information for the generative model, improving the quality of synthesized views. Experimental results demonstrate that our skeleton-guided method significantly enhances consistency and accuracy across diverse object categories within the Objaverse dataset. Our method outperforms existing state-of-the-art NVS techniques both quantitatively and qualitatively, without relying on explicit 3D representations.
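The abstract states only that a skeleton guide layer follows the ray conditioning normalization (RCN) layer; it does not say how either layer is computed. The sketch below is one plausible reading, not the authors' implementation. It assumes both layers act as per-pixel scale-and-shift modulations of a feature map, with RCN conditioned on a ray embedding and the skeleton guide conditioned on a rendered skeleton map. Every name, shape, and parameter (`normalize`, `modulate`, `rcn_then_skeleton_guide`, the `params` keys) is hypothetical.

```python
import numpy as np

def normalize(x, eps=1e-5):
    # Per-channel normalization over the spatial dimensions; x is (C, H, W).
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def modulate(x, cond, w_scale, w_shift):
    # Assumed conditioning form: project a per-pixel condition map
    # cond (D, H, W) to per-pixel scale and shift with (C, D) weights.
    scale = np.einsum("cd,dhw->chw", w_scale, cond)
    shift = np.einsum("cd,dhw->chw", w_shift, cond)
    return x * (1.0 + scale) + shift

def rcn_then_skeleton_guide(x, rays, skeleton, params):
    # Step 1 (assumed RCN form): normalize, then modulate by ray embeddings.
    h = modulate(normalize(x), rays, params["ray_scale"], params["ray_shift"])
    # Step 2 (hypothetical skeleton guide): modulate by the skeleton map.
    return modulate(h, skeleton, params["skel_scale"], params["skel_shift"])

# Toy usage with random inputs: 4 feature channels, 6 ray channels,
# 3 skeleton channels, on an 8x8 grid.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8))
rays = rng.normal(size=(6, 8, 8))
skeleton = rng.normal(size=(3, 8, 8))
params = {
    "ray_scale": 0.1 * rng.normal(size=(4, 6)),
    "ray_shift": 0.1 * rng.normal(size=(4, 6)),
    "skel_scale": 0.1 * rng.normal(size=(4, 3)),
    "skel_shift": 0.1 * rng.normal(size=(4, 3)),
}
out = rcn_then_skeleton_guide(x, rays, skeleton, params)
```

In this sketch, setting the skeleton weights to zero makes the second step an identity, so the output reduces to that of the RCN step alone. That is one way a guide layer could be added after an existing layer without disturbing a pre-trained baseline at initialization; whether the paper does this is not stated in the abstract.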