CV, Feb 23

One2Scene: Geometric Consistent Explorable 3D Scene Generation from a Single Image

arXiv:2602.19766v1 | 3 citations | h-index: 7
Originality: Highly original
AI Analysis

This work addresses a challenging problem in 3D vision with applications in virtual reality and robotics. The method is novel rather than incremental, although it builds on existing techniques to improve geometric consistency.

The paper tackles the generation of explorable 3D scenes from a single image, a task in which existing methods often produce geometric distortions and artifacts as the viewpoint changes. It introduces One2Scene, a framework that decomposes the task into three sub-tasks handled by a panorama generator, a feed-forward Gaussian Splatting network, and a novel view generator. The authors report stable, immersive exploration and substantial improvements over state-of-the-art methods.

Generating explorable 3D scenes from a single image is a highly challenging problem in 3D vision. Existing methods struggle to support free exploration, often producing severe geometric distortions and noisy artifacts when the viewpoint moves far from the original perspective. We introduce One2Scene, an effective framework that decomposes this ill-posed problem into three tractable sub-tasks to enable immersive explorable scene generation. We first use a panorama generator to produce anchor views from a single input image as initialization. Then, we lift these 2D anchors into an explicit 3D geometric scaffold via a generalizable, feed-forward Gaussian Splatting network. Instead of treating the panorama as a single image for reconstruction, we project it into multiple sparse anchor views and reformulate the reconstruction task as multi-view stereo matching, which allows us to leverage robust geometric priors learned from large-scale multi-view datasets. A bidirectional feature fusion module is used to enforce cross-view consistency, yielding an efficient and geometrically reliable scaffold. Finally, the scaffold serves as a strong prior for a novel view generator to produce photorealistic and geometrically accurate views at arbitrary cameras. By explicitly conditioning on a 3D-consistent scaffold to perform reconstruction, One2Scene works stably under large camera motions, supporting immersive scene exploration. Extensive experiments show that One2Scene substantially outperforms state-of-the-art methods in panorama depth estimation, feed-forward 360° reconstruction, and explorable 3D scene generation. Code and models will be released.
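The step of projecting a panorama into multiple sparse anchor views can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the panorama is stored in equirectangular layout and that each anchor view is a pinhole camera placed at the panorama center. The function names, the six evenly spaced yaw angles, the 90° field of view, and the nearest-neighbor sampling are all illustrative choices that the abstract does not specify.

```python
import math


def pixel_to_pano_uv(x, y, width, height, fov_deg, yaw_deg, pitch_deg=0.0):
    """Map pixel (x, y) of a pinhole view to normalized equirectangular (u, v).

    Camera frame (assumed): x right, y down, z forward.
    u wraps around in [0, 1); v runs from 0 (straight up) to 1 (straight down).
    """
    # Focal length in pixels, derived from the horizontal field of view.
    focal = 0.5 * width / math.tan(math.radians(fov_deg) / 2.0)

    # Ray through the pixel center, expressed in the camera frame.
    dx = (x + 0.5) - width / 2.0
    dy = (y + 0.5) - height / 2.0
    dz = focal

    pitch = math.radians(pitch_deg)
    yaw = math.radians(yaw_deg)

    # Rotate about the x axis (positive pitch looks up, i.e. toward -y).
    ry = dy * math.cos(pitch) - dz * math.sin(pitch)
    rz = dy * math.sin(pitch) + dz * math.cos(pitch)

    # Rotate about the y axis (positive yaw turns to the right).
    wx = dx * math.cos(yaw) + rz * math.sin(yaw)
    wz = -dx * math.sin(yaw) + rz * math.cos(yaw)
    wy = ry

    # Convert the ray direction to longitude and latitude.
    lon = math.atan2(wx, wz)
    lat = math.atan2(-wy, math.hypot(wx, wz))

    u = (lon / (2.0 * math.pi) + 0.5) % 1.0
    v = 0.5 - lat / math.pi
    return u, v


def render_anchor_view(pano, width, height, fov_deg, yaw_deg, pitch_deg=0.0):
    """Resample a panorama (list of rows) into one perspective view.

    Uses nearest-neighbor lookup for brevity.
    """
    pano_h = len(pano)
    pano_w = len(pano[0])
    view = []
    for y in range(height):
        row = []
        for x in range(width):
            u, v = pixel_to_pano_uv(x, y, width, height, fov_deg, yaw_deg, pitch_deg)
            px = min(int(u * pano_w), pano_w - 1)
            py = min(int(v * pano_h), pano_h - 1)
            row.append(pano[py][px])
        view.append(row)
    return view


def make_anchor_views(pano, num_views=6, size=5, fov_deg=90.0):
    """Render `num_views` views at evenly spaced yaw angles (hypothetical layout)."""
    views = []
    for i in range(num_views):
        yaw_deg = i * 360.0 / num_views
        views.append((yaw_deg, render_anchor_view(pano, size, size, fov_deg, yaw_deg)))
    return views
```

In this sketch each anchor view comes with a known camera rotation, which is the information a multi-view stereo formulation needs to relate pixels across views. A practical version would use bilinear interpolation and array operations instead of per-pixel Python loops.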

