Building Rome with Convex Optimization
This work addresses efficient and robust 3D reconstruction from images for computer vision applications. Its contribution is a strong, specific gain rather than a broad paradigm shift.
The authors tackle global bundle adjustment in structure from motion with a convex optimization approach that uses learned depth prediction. The approach achieves certifiable global optimality, and the resulting pipeline reconstructs scenes with quality comparable to existing pipelines while being significantly faster, more scalable, and initialization-free.
Global bundle adjustment is made easy by depth prediction and convex optimization. We (i) propose a scaled bundle adjustment (SBA) formulation that lifts 2D keypoint measurements to 3D with learned depth, (ii) design an empirically tight convex semidefinite program (SDP) relaxation that solves SBA to certifiable global optimality, (iii) solve the SDP relaxations at extreme scale with Burer-Monteiro factorization and a CUDA-based trust-region Riemannian optimizer (dubbed XM), and (iv) build a structure from motion (SfM) pipeline with XM as the optimization engine and show that XM-SfM compares favorably with existing pipelines in terms of reconstruction quality while being significantly faster, more scalable, and initialization-free.
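
To make the lifting step in (i) concrete, the following is a minimal sketch of back-projecting 2D keypoints to 3D using a per-keypoint depth. It assumes a standard pinhole camera with a known intrinsic matrix `K` and depths measured along the optical axis. The function name `lift_keypoints` and the numeric values are illustrative, not taken from the paper, and any per-image scale or other unknowns in the actual SBA formulation are not modeled here.

```python
import numpy as np


def lift_keypoints(keypoints_px, depths, K):
    """Back-project (N, 2) pixel keypoints with (N,) depths to (N, 3) camera-frame points.

    Assumes a pinhole model: X = d * K^{-1} [u, v, 1]^T, with d the z-depth.
    """
    keypoints_px = np.asarray(keypoints_px, dtype=float)
    depths = np.asarray(depths, dtype=float)
    ones = np.ones((keypoints_px.shape[0], 1))
    homogeneous = np.hstack([keypoints_px, ones])   # (N, 3) homogeneous pixels
    # Solve K @ ray = pixel for each keypoint; each ray has z-component 1.
    rays = np.linalg.solve(K, homogeneous.T).T
    return rays * depths[:, None]                   # scale each ray by its depth


# Hypothetical intrinsics: focal length 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
keypoints = np.array([[320.0, 240.0], [420.0, 240.0]])
depths = np.array([2.0, 4.0])  # stand-in for learned depth predictions

points_3d = lift_keypoints(keypoints, depths, K)
print(points_3d)
```

Under these assumptions, each 2D measurement becomes a 3D point in its own camera frame, so the remaining unknowns are the quantities that align those per-camera point sets in a common frame.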