CV · Jan 19, 2016

PN-Net: Conjoined Triple Deep Network for Learning Local Image Descriptors

arXiv:1601.05030v1 · 180 citations
Originality: Incremental advance
AI Analysis

The paper addresses a practical bottleneck for computer vision applications that require efficient local image matching.

The paper tackles the problem of high computational complexity in CNN-based local image descriptors by proposing PN-Net, which achieves improved matching performance while significantly reducing training/execution time and maintaining low dimensionality. The 128-dimensional descriptor extraction time on GPU is comparable to fast binary descriptors like BRIEF and ORB.

In this paper we propose a new approach for learning local descriptors for matching image patches. It has recently been demonstrated that descriptors based on convolutional neural networks (CNN) can significantly improve the matching performance. Unfortunately, their computational complexity is prohibitive for any practical application. We address this problem and propose a CNN-based descriptor with improved matching performance, significantly reduced training and execution time, as well as low dimensionality. We propose to train the network with triplets of patches that include a positive and a negative pair. To that end we introduce a new loss function that exploits the relations within the triplets. We compare our approach to the recently introduced MatchNet and DeepCompare and demonstrate the advantages of our descriptor in terms of performance, memory footprint, and speed: when run on a GPU, the extraction time of our 128-dimensional feature is comparable to that of the fastest available binary descriptors such as BRIEF and ORB.
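The abstract does not spell out the proposed loss, so the following is only a minimal sketch of the general idea of training on triplets, using the standard margin-based triplet loss rather than the paper's own formulation. The function names, the `margin` parameter, and the `use_both_negatives` flag are hypothetical; the flag illustrates one possible reading of "exploiting the relations within the triplets", namely that the negative patch is compared against both patches of the positive pair.

```python
import math


def l2_distance(u, v):
    """Euclidean distance between two equal-length descriptor vectors."""
    if len(u) != len(v):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0,
                        use_both_negatives=False):
    """Standard hinge-style triplet loss (NOT the paper's loss function).

    anchor, positive: descriptors of two patches of the same physical point.
    negative: descriptor of a patch of a different point.
    use_both_negatives: ASSUMPTION for illustration only. When True, the
        negative distance is the smaller of d(anchor, negative) and
        d(positive, negative), so both negative pairs in the triplet count.
    """
    d_pos = l2_distance(anchor, positive)
    d_neg = l2_distance(anchor, negative)
    if use_both_negatives:
        d_neg = min(d_neg, l2_distance(positive, negative))
    # Zero loss once the negative is at least `margin` farther than the positive.
    return max(0.0, margin + d_pos - d_neg)


# Toy 2-D descriptors (real descriptors in the paper are 128-dimensional).
a, p, n = [0.0, 0.0], [2.0, 0.0], [3.0, 0.0]
loss_standard = triplet_margin_loss(a, p, n)
loss_both = triplet_margin_loss(a, p, n, use_both_negatives=True)
```

In this toy case the negative is far enough from the anchor to satisfy the margin, but it lies close to the positive patch, so the two variants disagree. That is the kind of within-triplet relation a plain anchor-only loss would miss.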

Code Implementations: 1 repo
