Local Descriptors Optimized for Average Precision
The work improves descriptor matching for computer vision tasks, but the contribution is incremental, since it builds on existing learning-based approaches.
The paper tackles the problem of learning local feature descriptors by directly optimizing Average Precision for descriptor matching, achieving state-of-the-art results in patch verification, patch retrieval, and image matching on standard benchmarks.
Extraction of local feature descriptors is a vital stage in the solution pipelines for numerous computer vision tasks. Learning-based approaches improve performance in certain tasks, but still cannot replace handcrafted features in general. In this paper, we improve the learning of local feature descriptors by optimizing the performance of descriptor matching, which is a common stage that follows descriptor extraction in local-feature-based pipelines, and can be formulated as nearest neighbor retrieval. Specifically, we directly optimize a ranking-based retrieval performance metric, Average Precision, using deep neural networks. This general-purpose solution can also be viewed as a listwise learning-to-rank approach, which is advantageous compared to recent local ranking approaches. On standard benchmarks, descriptors learned with our formulation achieve state-of-the-art results in patch verification, patch retrieval, and image matching.
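To make the retrieval formulation concrete, the sketch below ranks candidate descriptors by distance to a query descriptor and computes Average Precision over that ranking. It only evaluates the metric on a toy example. It does not show how the paper optimizes the metric with a deep network. The function name `average_precision`, the use of Euclidean distance, and the toy 2-D descriptors are illustrative assumptions, not details taken from the paper.

```python
import math


def average_precision(query, candidates, labels):
    """Average Precision of `candidates` ranked by distance to `query`.

    labels[i] is 1 if candidates[i] is a true match for the query, else 0.
    Euclidean distance is an assumption made for this illustration.
    """
    # Nearest neighbor retrieval: sort candidate indices by increasing distance.
    order = sorted(range(len(candidates)),
                   key=lambda i: math.dist(query, candidates[i]))
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            # Precision at this rank, accumulated only at true matches.
            precision_sum += hits / rank
    # Convention used here: AP is 0 when the query has no true match.
    return precision_sum / hits if hits else 0.0


# Toy 2-D "descriptors" (hypothetical data): two true matches, two non-matches.
query = [0.0, 0.0]
candidates = [[0.1, 0.0], [1.0, 1.0], [0.2, 0.1], [0.5, 0.5]]
labels = [1, 0, 0, 1]

ap = average_precision(query, candidates, labels)
print(f"AP = {ap:.4f}")
```

In this example the true matches land at ranks 1 and 3, so the metric averages the precisions 1/1 and 2/3. Because the score depends on the whole ranked list rather than on individual pairs or triplets, it illustrates why the abstract describes the approach as listwise.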