CV · Aug 25, 2025

Camera Pose Refinement via 3D Gaussian Splatting

Lulu Hao, Lipu Zhou, Zhenzhong Wei, Xu Wang

arXiv:2508.17876v1 · h-index: 1

Originality: Incremental advance

AI Analysis

This work addresses inaccurate camera pose estimation in 3D computer vision, offering a lightweight solution that can be applied to diverse scenes without retraining. It is an incremental advance, as it builds on existing 3DGS methods.

The paper tackles camera pose refinement with a framework that uses 3D Gaussian Splatting to render views and applies epipolar constraints between the query and the rendered images, reducing median translation and rotation errors by 53.3% and 56.9% on 7-Scenes and by 40.7% and 53.2% on Cambridge Landmarks.

Camera pose refinement aims to improve the accuracy of an initial pose estimate for applications in 3D computer vision. Most refinement approaches rely on 2D-3D correspondences with specific descriptors or dedicated networks, which requires reconstructing the scene again for a different descriptor or fully retraining the network for each scene. Some recent methods instead infer pose from feature similarity, but their lack of geometric constraints results in lower accuracy. To overcome these limitations, we propose a novel camera pose refinement framework leveraging 3D Gaussian Splatting (3DGS), referred to as GS-SMC. Given the widespread usage of 3DGS, our method can employ an existing 3DGS model to render novel views, providing a lightweight solution that can be directly applied to diverse scenes without additional training or fine-tuning. Specifically, we introduce an iterative optimization approach that refines the camera pose using epipolar geometric constraints among the query and multiple rendered images. Our method allows a flexible choice of feature extractors and matchers to establish these constraints. Extensive empirical evaluations on the 7-Scenes and Cambridge Landmarks datasets demonstrate that our method outperforms state-of-the-art camera pose refinement approaches, achieving 53.3% and 56.9% reductions in median translation and rotation errors on 7-Scenes, and 40.7% and 53.2% on Cambridge Landmarks.
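To make the epipolar constraint mentioned in the abstract concrete, the following is a minimal sketch of the standard two-view relation x2^T E x1 = 0 with E = [t]_x R, evaluated on synthetic matches. It is not the authors' GS-SMC implementation: 3DGS rendering, feature matching, and the iterative optimizer are omitted, and every name here (`skew`, `essential_matrix`, `epipolar_residuals`, the synthetic pose and points) is a hypothetical chosen for illustration.

```python
import numpy as np

def skew(t):
    """Return the 3x3 cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])

def essential_matrix(R, t):
    """Essential matrix E = [t]_x R for the relative pose X2 = R @ X1 + t."""
    return skew(t) @ R

def epipolar_residuals(R, t, x1, x2):
    """Algebraic epipolar residuals x2^T E x1 for N matches.

    x1, x2: (N, 3) arrays of normalized homogeneous image coordinates.
    """
    E = essential_matrix(R, t)
    return np.einsum("ni,ij,nj->n", x2, E, x1)

# Synthetic scene (assumption): 20 points in front of camera 1.
rng = np.random.default_rng(0)
X1 = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(20, 3))

# Hypothetical ground-truth relative pose: small rotation about the y axis plus a translation.
angle = 0.1
c, s = np.cos(angle), np.sin(angle)
R_true = np.array([[c, 0.0, s],
                   [0.0, 1.0, 0.0],
                   [-s, 0.0, c]])
t_true = np.array([0.5, 0.0, 0.1])

# Transform the points into camera 2 and project both views to normalized coordinates.
X2 = X1 @ R_true.T + t_true
x1 = X1 / X1[:, 2:3]
x2 = X2 / X2[:, 2:3]

# Residuals vanish (up to rounding) at the true pose and grow when the pose is wrong.
res_true = epipolar_residuals(R_true, t_true, x1, x2)
res_bad = epipolar_residuals(np.eye(3), t_true, x1, x2)

print("max |residual| at true pose:     ", np.abs(res_true).max())
print("max |residual| at perturbed pose:", np.abs(res_bad).max())
```

A refinement loop in this spirit would adjust the query pose to shrink such residuals across several rendered views. Note that the constraint is invariant to the scale of t, so a single image pair fixes only the translation direction; combining multiple rendered images with known poses is one way to constrain the full translation.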
