GeoFusionLRM: Geometry-Aware Self-Correction for Consistent 3D Reconstruction
This work addresses geometric fidelity in 3D reconstruction, with applications in computer vision and graphics, and represents an incremental improvement over existing methods.
The paper tackles geometric inconsistencies and misaligned details in single-image 3D reconstruction with large reconstruction models (LRMs), and reports improved alignment, sharper geometry, and higher fidelity than state-of-the-art LRM baselines.
Single-image 3D reconstruction with large reconstruction models (LRMs) has advanced rapidly, yet reconstructions often exhibit geometric inconsistencies and misaligned details that limit fidelity. We introduce GeoFusionLRM, a geometry-aware self-correction framework that leverages the model's own normal and depth predictions to refine structural accuracy. Unlike prior approaches that rely solely on features extracted from the input image, GeoFusionLRM feeds back geometric cues through a dedicated transformer and fusion module, enabling the model to correct errors and enforce consistency with the conditioning image. This design improves the alignment between the reconstructed mesh and the input views without additional supervision or external signals. Extensive experiments demonstrate that GeoFusionLRM achieves sharper geometry, more consistent normals, and higher fidelity than state-of-the-art LRM baselines.
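The feedback idea described above can be illustrated with a minimal sketch. It is not the GeoFusionLRM architecture: the abstract does not specify the internals of the transformer or the fusion module, so everything here is an assumption made for illustration. The names `cross_attend`, `fuse`, `self_correct`, and the `gate` parameter are hypothetical, the attention has no learned projections, and the `reconstruct`, `render_geometry`, and `encode_geometry` callables stand in for components the source does not detail.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys, values):
    """Single-head scaled dot-product attention (no learned projections)."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(dim)])
    return out


def fuse(image_tokens, geometry_tokens, gate=0.5):
    """Hypothetical fusion: image tokens attend to geometry tokens, and the
    attended result is added back as a gated residual."""
    attended = cross_attend(image_tokens, geometry_tokens, geometry_tokens)
    return [[x + gate * a for x, a in zip(tok, att)]
            for tok, att in zip(image_tokens, attended)]


def self_correct(image_tokens, reconstruct, render_geometry, encode_geometry,
                 gate=0.5):
    """Assumed two-pass loop: reconstruct, derive normals and depth from the
    model's own output, feed them back, then reconstruct again."""
    first_pass = reconstruct(image_tokens)
    normals, depth = render_geometry(first_pass)
    geometry_tokens = encode_geometry(normals, depth)
    fused = fuse(image_tokens, geometry_tokens, gate)
    return reconstruct(fused)


if __name__ == "__main__":
    # Toy stand-ins for the real components (all hypothetical).
    image_tokens = [[1.0, 0.0], [0.0, 1.0]]
    refined = self_correct(
        image_tokens,
        reconstruct=lambda tokens: tokens,
        render_geometry=lambda recon: (recon, recon),
        encode_geometry=lambda n, d: [[a + b for a, b in zip(x, y)]
                                      for x, y in zip(n, d)],
    )
    print(len(refined), len(refined[0]))
```

In this sketch the gated residual keeps the image tokens as the primary signal, so setting `gate` to zero recovers the image-only conditioning that the abstract attributes to prior approaches. Whether the actual method uses a residual, a gate, or a different fusion operator is not stated in the source.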