CV | AI | Feb 23, 2025

Dragen3D: Multiview Geometry Consistent 3D Gaussian Generation with Drag-Based Control

arXiv:2502.16475v1 | 1 citation | h-index: 1
Originality: Highly original
AI Analysis

This work addresses the generation and editing of 3D models from single images for applications in virtual reality and digital content creation, proposing a novel method for a known bottleneck.

The paper tackles single-image 3D generation by addressing multi-view geometric inconsistency and limited controllability. It introduces Dragen3D, which achieves geometrically consistent and controllable 3D generation with quality comparable to state-of-the-art methods.

Single-image 3D generation has emerged as a prominent research topic, playing a vital role in virtual reality, 3D modeling, and digital content creation. However, existing methods face challenges such as a lack of multi-view geometric consistency and limited controllability during the generation process, which significantly restrict their usability. To tackle these challenges, we introduce Dragen3D, a novel approach that achieves geometrically consistent and controllable 3D generation by leveraging 3D Gaussian Splatting (3DGS). We introduce the Anchor-Gaussian Variational Autoencoder (Anchor-GS VAE), which encodes a point cloud and a single image into anchor latents and decodes these latents into 3DGS, enabling efficient latent-space generation. To enable multi-view geometry-consistent and controllable generation, we propose a Seed-Point-Driven strategy: first generate sparse seed points as a coarse geometry representation, then map them to anchor latents via the Seed-Anchor Mapping Module. Geometric consistency is ensured by the easily learned sparse seed points, and users can intuitively drag the seed points to deform the final 3DGS geometry, with changes propagated through the anchor latents. To the best of our knowledge, we are the first to achieve geometrically controllable 3D Gaussian generation and editing without relying on 2D diffusion priors, delivering comparable 3D generation quality to state-of-the-art methods.
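The drag-based control described in the abstract can be illustrated with a toy sketch. It is not the paper's method: in Dragen3D the change is propagated through learned anchor latents, whereas this sketch assumes a hand-written Gaussian distance weighting as a stand-in. The function names `drag_seed` and `propagate`, and the `sigma` parameter, are hypothetical and chosen only for illustration.

```python
import math


def drag_seed(seeds, index, delta):
    """Return a copy of the seed points with one seed moved by `delta`."""
    moved = [tuple(p) for p in seeds]
    x, y, z = moved[index]
    moved[index] = (x + delta[0], y + delta[1], z + delta[2])
    return moved


def propagate(points, seeds, moved_seeds, sigma=0.5):
    """Shift each dense point by a weighted average of seed displacements.

    ASSUMPTION: Gaussian distance weights replace the learned
    Seed-Anchor Mapping Module; this is only a geometric toy.
    """
    deformed = []
    for p in points:
        # Weight of each seed falls off with its distance to the point.
        weights = [
            math.exp(-math.dist(p, s) ** 2 / (2 * sigma ** 2)) for s in seeds
        ]
        total = sum(weights)
        if total == 0.0:
            # Point is too far from every seed to be influenced.
            deformed.append(tuple(p))
            continue
        shift = [
            sum(w * (m[k] - s[k]) for w, s, m in zip(weights, seeds, moved_seeds))
            / total
            for k in range(3)
        ]
        deformed.append(tuple(p[k] + shift[k] for k in range(3)))
    return deformed


# Two sparse seeds; drag the first one upward along y.
seeds = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
moved = drag_seed(seeds, 0, (0.0, 1.0, 0.0))

# Dense points near each seed: only the one near the dragged seed moves.
dense = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
result = propagate(dense, seeds, moved)
```

In this toy setup, a point sitting on the dragged seed follows it almost exactly, while a point near the untouched seed stays essentially in place. That local behaviour is the intuition behind dragging sparse seed points to deform the final geometry.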

