Single-View 3D Reconstruction via SO(2)-Equivariant Gaussian Sculpting Networks
This work addresses 3D reconstruction for robotics and computer vision and offers an efficient alternative to expensive methods, though it appears incremental because it builds on existing Gaussian splatting and equivariant-network techniques.
The paper tackles single-view 3D reconstruction by introducing SO(2)-Equivariant Gaussian Sculpting Networks (GSNs), which generate Gaussian splat representations from single images, reaching quality competitive with diffusion-based methods at high throughput (>150 FPS).
This paper introduces SO(2)-Equivariant Gaussian Sculpting Networks (GSNs) as an approach for SO(2)-equivariant 3D object reconstruction from single-view image observations. GSNs take a single observation as input and generate a Gaussian splat representation describing the observed object's geometry and texture. By using a shared feature extractor before decoding Gaussian colors, covariances, positions, and opacities, GSNs achieve extremely high throughput (>150 FPS). Experiments demonstrate that GSNs can be trained efficiently using a multi-view rendering loss and are competitive in quality with expensive diffusion-based reconstruction algorithms. The GSN model is validated on multiple benchmark experiments. Moreover, we demonstrate the potential for GSNs to be used within a robotic manipulation pipeline for object-centric grasping.
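To make the SO(2)-equivariance property concrete, the following minimal sketch shows how a planar rotation acts on a single Gaussian with the four attributes named in the abstract (position, covariance, color, opacity). It is an illustration under stated assumptions, not the authors' implementation: the rotation axis (here the z axis), the dictionary layout, the view-independent color, and the helper names `rot_z` and `rotate_gaussian` are all hypothetical. The only fact relied on is the standard transformation rule for a Gaussian under a rotation R: the mean maps to R·mean and the covariance to R·cov·Rᵀ. An equivariant network would be expected to output the rotated set of Gaussians when its input is rotated accordingly.

```python
import math

def rot_z(theta):
    """3x3 rotation by theta about the z axis (an SO(2) element embedded in 3D).
    The choice of axis is an assumption made for this illustration."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Plain-Python matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def matvec(a, v):
    """Matrix-vector product."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def rotate_gaussian(g, theta):
    """Apply the rotation to one Gaussian: mean -> R mean, cov -> R cov R^T.
    Color and opacity are left unchanged (view-independent color assumed)."""
    r = rot_z(theta)
    return {
        "position": matvec(r, g["position"]),
        "covariance": matmul(matmul(r, g["covariance"]), transpose(r)),
        "color": list(g["color"]),
        "opacity": g["opacity"],
    }

# Hypothetical example Gaussian: elongated along x, centered at (1, 0, 0.5).
gaussian = {
    "position": [1.0, 0.0, 0.5],
    "covariance": [[0.04, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.02]],
    "color": [0.8, 0.2, 0.1],
    "opacity": 0.9,
}

# A quarter turn moves the center onto the y axis and swaps the x/y extents.
rotated = rotate_gaussian(gaussian, math.pi / 2)

# Rotating back recovers the original parameters up to floating-point error.
restored = rotate_gaussian(rotated, -math.pi / 2)
```

In this sketch, a quarter turn carries the Gaussian's center from the x axis to the y axis and exchanges the x and y variances, while the z component, color, and opacity are untouched. Checking that a network's output transforms this way under rotated inputs is one simple way to test an equivariance claim numerically.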