CV · Apr 22, 2024

X-Ray: A Sequential 3D Representation For Generation

arXiv:2404.14329v2 · 9 citations · h-index: 13 · NIPS
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of efficient 3D representation and generation for computer vision and graphics applications, though it appears incremental because it builds on video diffusion models.

The paper tackles the problem of generating 3D models from images by introducing X-Ray, a sequential 3D representation that transforms objects into surface frames, enabling state-of-the-art accuracy in 3D generation from single images.

We introduce X-Ray, a novel 3D sequential representation inspired by the penetrability of x-ray scans. X-Ray transforms a 3D object into a series of surface frames at different layers, making it suitable for generating 3D models from images. Our method utilizes ray casting from the camera center to capture geometric and textured details, including depth, normal, and color, across all intersected surfaces. This process efficiently condenses the whole 3D object into a multi-frame video format, motivating the use of a network architecture similar to those in video diffusion models. This design ensures an efficient 3D representation by focusing solely on surface information. We also propose a two-stage pipeline that generates 3D objects with an X-Ray Diffusion Model and an Upsampler. We demonstrate the practicality and adaptability of our X-Ray representation by synthesizing the complete visible and hidden surfaces of a 3D object from a single input image. Experimental results show that our representation achieves state-of-the-art accuracy in 3D generation, paving the way for new 3D representation research and practical applications.
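
The ray-casting step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a scene made only of spheres, a pinhole camera at the origin looking along +z, and hypothetical names (`sphere_hits`, `xray_frames`). It records a hit mask, depth, and normal for every surface a ray crosses, ordered front to back, so that layer 0 holds the visible surface and later layers hold hidden ones. The hit mask is an added assumption used to mark empty layers, and color is omitted for brevity.

```python
import numpy as np

def sphere_hits(origin, direction, center, radius):
    """Return (t, normal) for each crossing of a unit-direction ray with a sphere."""
    oc = origin - center
    b = np.dot(oc, direction)
    c = np.dot(oc, oc) - radius * radius
    disc = b * b - c
    if disc <= 0.0:  # ray misses (or only grazes) the sphere
        return []
    root = np.sqrt(disc)
    hits = []
    for t in (-b - root, -b + root):  # entry and exit distances
        if t > 0.0:  # keep crossings in front of the camera
            point = origin + t * direction
            hits.append((t, (point - center) / radius))  # outward unit normal
    return hits

def xray_frames(spheres, height=9, width=9, layers=4, focal=1.0):
    """Cast one ray per pixel and store the first `layers` surface crossings."""
    hit = np.zeros((layers, height, width), dtype=bool)
    depth = np.zeros((layers, height, width))
    normal = np.zeros((layers, height, width, 3))
    origin = np.zeros(3)  # assumed camera center
    for i in range(height):
        for j in range(width):
            # Pixel center on an image plane at z = focal spanning [-1, 1]
            x = (j + 0.5) / width * 2.0 - 1.0
            y = 1.0 - (i + 0.5) / height * 2.0
            d = np.array([x, y, focal])
            d /= np.linalg.norm(d)
            hits = []
            for center, radius in spheres:
                hits.extend(sphere_hits(origin, d, np.asarray(center, float), radius))
            hits.sort(key=lambda h: h[0])  # front-to-back order along the ray
            for k, (t, n) in enumerate(hits[:layers]):
                hit[k, i, j] = True
                depth[k, i, j] = t  # distance along the ray
                normal[k, i, j] = n
    return hit, depth, normal

# One unit sphere three units in front of the camera
hit, depth, normal = xray_frames([((0.0, 0.0, 3.0), 1.0)])
print(hit[:, 4, 4], depth[:, 4, 4])
```

For the central pixel the ray crosses the sphere twice, so the first two layers are filled (front and back surface) and the remaining layers stay empty. Stacking the per-layer arrays along the first axis gives the frame sequence that the abstract likens to a video.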

Code Implementations: 1 repo