GeoDesc: Learning Local Descriptors by Integrating Geometry Constraints
This work addresses a domain-specific problem in 3D reconstruction and computer vision, offering incremental improvements by integrating geometry constraints into descriptor learning.
The paper tackles the poor generalization of learned local descriptors to image-based 3D reconstruction. It proposes GeoDesc, a descriptor whose learning integrates geometry constraints from multi-view reconstructions, and reports superior performance on large-scale benchmarks and an improved accuracy-efficiency trade-off in reconstruction tasks.
Learned local descriptors based on Convolutional Neural Networks (CNNs) have achieved significant improvements on patch-based benchmarks, but have not demonstrated strong generalization ability on recent benchmarks of image-based 3D reconstruction. In this paper, we mitigate this limitation by proposing a novel local descriptor learning approach that integrates geometry constraints from multi-view reconstructions, which benefits the learning process in terms of data generation, data sampling and loss computation. We refer to the proposed descriptor as GeoDesc and demonstrate its superior performance on various large-scale benchmarks, in particular on challenging reconstruction tasks. Moreover, we provide guidelines towards practical integration of learned descriptors in Structure-from-Motion (SfM) pipelines, showing the good trade-off between accuracy and efficiency that GeoDesc delivers for 3D reconstruction tasks.