Towards Understanding 3D Vision: the Role of Gaussian Curvature
This work addresses interpretability and transferability in 3D vision for researchers; its contribution is incremental in that it proposes a geometric prior rather than a new method.
The paper tackles the lack of explicit 3D geometry models in deep learning-based vision by investigating Gaussian curvature, showing that it provides a sparse description of surfaces and that low total absolute Gaussian curvature correlates with the performance rank of top methods on the Middlebury dataset.
Recent advances in computer vision have predominantly relied on data-driven approaches that leverage deep learning and large-scale datasets. Deep neural networks have achieved remarkable success in tasks such as stereo matching and monocular depth reconstruction. However, these methods lack explicit models of 3D geometry that can be directly analyzed, transferred across modalities, or systematically modified for controlled experimentation. We investigate the role of Gaussian curvature in 3D surface modeling. Gaussian curvature is invariant under a change of observer or coordinate system, and we demonstrate on the Middlebury stereo dataset that it also offers a sparse and compact description of 3D surfaces. Furthermore, we show that the performance rank of top state-of-the-art stereo and monocular methods correlates strongly with low total absolute Gaussian curvature. We propose that this property can serve as a geometric prior to improve future 3D reconstruction algorithms.
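
To make the quantity concrete, the sketch below estimates Gaussian curvature and total absolute Gaussian curvature from a depth map. It is an illustration under stated assumptions, not the procedure used in the paper: the depth map is treated as a height field z(x, y) on a regular grid (a Monge patch), camera projection is ignored, and derivatives are taken with finite differences. Under that assumption the standard formula is K = (z_xx z_yy - z_xy^2) / (1 + z_x^2 + z_y^2)^2. The function names and the `spacing` parameter are hypothetical.

```python
import numpy as np


def gaussian_curvature(z, spacing=1.0):
    """Gaussian curvature of a height field z[y, x] sampled on a regular grid.

    Assumption: the surface is the graph of z(x, y); this is not necessarily
    how the paper computes curvature from Middlebury data.
    """
    z = np.asarray(z, dtype=float)
    # np.gradient returns derivatives in axis order: rows (y), then columns (x).
    zy, zx = np.gradient(z, spacing)
    zxy, zxx = np.gradient(zx, spacing)
    zyy, _ = np.gradient(zy, spacing)
    denom = (1.0 + zx**2 + zy**2) ** 2
    return (zxx * zyy - zxy**2) / denom


def total_absolute_curvature(z, spacing=1.0):
    """Discrete approximation of the surface integral of |K|."""
    z = np.asarray(z, dtype=float)
    zy, zx = np.gradient(z, spacing)
    k = gaussian_curvature(z, spacing)
    # Area element of the graph surface: sqrt(1 + zx^2 + zy^2) dx dy.
    area = np.sqrt(1.0 + zx**2 + zy**2) * spacing**2
    return float(np.sum(np.abs(k) * area))


if __name__ == "__main__":
    xs = np.linspace(-1.0, 1.0, 21)  # grid spacing of 0.1
    x, y = np.meshgrid(xs, xs)
    plane = 2.0 * x + 3.0 * y          # flat surface
    bowl = 0.5 * (x**2 + y**2)         # paraboloid
    print(total_absolute_curvature(plane, 0.1))
    print(gaussian_curvature(bowl, 0.1)[10, 10])
```

For a plane the curvature vanishes everywhere, and at the apex of the paraboloid z = (x^2 + y^2) / 2 the analytic value is K = 1, so the two printed numbers offer a quick sanity check. Piecewise planar or developable scenes have K = 0 almost everywhere, which is the intuition behind describing such a representation as sparse.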