CV, Dec 9, 2017

SPP-Net: Deep Absolute Pose Regression with Synthetic Views

arXiv:1712.03452v1, 182 citations
Originality: Highly original
AI Analysis

This work addresses the accuracy gap between CNN-based and geometry-based methods for pose estimation, offering a more efficient and scalable solution for applications like robotics and augmented reality.

The paper tackles the problem of image-based localization for robotics and AR by proposing a deep neural network that uses sparse feature descriptors and synthetic viewpoint augmentation to estimate absolute pose, achieving state-of-the-art performance with improved generalization to unseen poses.

Image-based localization is an important problem in computer vision due to its wide applicability in robotics, augmented reality, and autonomous systems. The literature describes a rich set of methods for geometrically registering a 2D image with respect to a 3D model. Recently, methods based on deep (and convolutional) feedforward networks (CNNs) have become popular for pose regression. However, these CNN-based methods are still less accurate than geometry-based methods, despite being fast and memory efficient. In this work we design a deep neural network architecture based on sparse feature descriptors to estimate the absolute pose of an image. Our choice of sparse feature descriptors has two major advantages. First, our network is significantly smaller than the CNNs proposed in the literature for this task, which makes our approach more efficient and scalable. Second, and more importantly, the use of sparse features allows us to augment the training data with synthetic viewpoints, which leads to substantial improvements in generalization to unseen poses. Thus, our proposed method aims to combine the best of two worlds, feature-based localization and CNN-based pose regression, to achieve state-of-the-art performance in absolute pose estimation. A detailed analysis of the proposed architecture and a rigorous evaluation on existing datasets are provided to support our method.
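
The abstract does not say how the synthetic viewpoints are generated. One plausible reading is that 3D points carrying feature descriptors are reprojected into a virtual camera, which yields a new set of sparse 2D features labeled with a known pose. The sketch below illustrates that idea only. The pinhole camera model, the function name `synthesize_view`, and all parameters are assumptions made for illustration and are not taken from the paper.

```python
def synthesize_view(points, R, C, f, cx, cy, width, height):
    """Project 3D points into a virtual pinhole camera (illustrative assumption).

    points : list of (X, Y, Z) world coordinates; the list index stands in
             for the descriptor attached to each point.
    R      : 3x3 world-to-camera rotation, as nested lists.
    C      : camera center in world coordinates.
    f, cx, cy     : focal length and principal point, in pixels.
    width, height : image size, in pixels.

    Returns a list of (point_index, u, v) for points that land in the image.
    """
    features = []
    for idx, (X, Y, Z) in enumerate(points):
        # Express the point relative to the camera center.
        d = (X - C[0], Y - C[1], Z - C[2])
        # Rotate into the camera frame: X_cam = R (X_world - C).
        x = sum(R[0][k] * d[k] for k in range(3))
        y = sum(R[1][k] * d[k] for k in range(3))
        z = sum(R[2][k] * d[k] for k in range(3))
        if z <= 1e-9:
            continue  # point is behind (or on) the camera plane
        # Perspective division followed by the intrinsics.
        u = f * x / z + cx
        v = f * y / z + cy
        if 0 <= u < width and 0 <= v < height:
            features.append((idx, u, v))
    return features


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    cloud = [(0.0, 0.0, 5.0), (1.0, 0.0, 5.0), (0.0, 0.0, -5.0)]
    # Original viewpoint, then a synthetic one shifted 1 unit along x.
    print(synthesize_view(cloud, identity, (0.0, 0.0, 0.0), 100.0, 320.0, 240.0, 640, 480))
    print(synthesize_view(cloud, identity, (1.0, 0.0, 0.0), 100.0, 320.0, 240.0, 640, 480))
```

Under this assumption, each call with a different `R` and `C` would produce one extra training sample: the visible features (with their descriptors) as input and the chosen pose as the regression target. Dense image-based CNNs cannot be augmented this way without rendering new images, which may be why the abstract ties synthetic viewpoints to the sparse-feature design.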

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
