HyNet: Learning Local Descriptor with Hybrid Similarity Measure and Triplet Loss
This work addresses the problem of improving local descriptor matching for computer vision applications, representing an incremental advancement with strong specific gains.
The paper tackles the problem of local descriptor learning by analyzing the effect of L2 normalization on gradients and proposes HyNet, which achieves state-of-the-art results in matching tasks, outperforming previous methods by a significant margin on benchmarks including patch matching, verification, retrieval, and 3D reconstruction.
Recent works show that local descriptor learning benefits from the use of L2 normalisation, however, an in-depth analysis of this effect lacks in the literature. In this paper, we investigate how L2 normalisation affects the back-propagated descriptor gradients during training. Based on our observations, we propose HyNet, a new local descriptor that leads to state-of-the-art results in matching. HyNet introduces a hybrid similarity measure for triplet margin loss, a regularisation term constraining the descriptor norm, and a new network architecture that performs L2 normalisation of all intermediate feature maps and the output descriptors. HyNet surpasses previous methods by a significant margin on standard benchmarks that include patch matching, verification, and retrieval, as well as outperforming full end-to-end methods on 3D reconstruction tasks.