CV, Apr 22, 2019

FishNet: A Camera Localizer using Deep Recurrent Networks

arXiv:1904.09722v1
Originality: Incremental advance
AI Analysis

This work addresses robust camera localization for applications such as robotics and autonomous systems. It appears incremental because it builds on existing deep learning and LSTM methods for pose estimation.

The paper tackles camera pose estimation by proposing FishNet, a deep recurrent network architecture that extracts temporal and spatial information using LSTM with a pose regularization term, achieving smoother and more accurate 6-DOF localization on three benchmark datasets.

This paper proposes a robust localization system that employs deep learning for better scene representation and enhances the accuracy of 6-DOF camera pose estimation. Inspired by the fact that a wide field of view can reveal global scene structure, we leverage the large overlap between adjacent frames of a fisheye camera, together with the powerful high-level feature representations of deep learning. Our main contribution is a novel network architecture that extracts both temporal and spatial information using a Recurrent Neural Network. Specifically, we propose a novel pose regularization term combined with LSTM. This leads to smoother pose estimation, especially for large outdoor scenery. Promising experimental results on three benchmark datasets demonstrate the effectiveness of the proposed approach.
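The abstract does not give the form of the pose regularization term. As a rough illustration only, the sketch below shows one common way a sequence-level pose loss can encourage smooth trajectories: a per-frame position and orientation error plus a penalty on mismatched frame-to-frame motion. The function name `sequence_pose_loss`, the weights `beta` and `lam`, and the specific smoothness penalty are all assumptions made for this example, not the paper's formulation.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def normalize(q):
    """Scale a quaternion to unit length."""
    n = math.sqrt(sum(x * x for x in q))
    return [x / n for x in q]

def quat_dist(q, qt):
    """Distance between unit quaternions, treating q and -q as the same rotation."""
    q, qt = normalize(q), normalize(qt)
    return min(l2(q, qt), l2(q, [-x for x in qt]))

def sequence_pose_loss(pred, target, beta=1.0, lam=0.1):
    """Hypothetical loss over a sequence of poses.

    pred, target: lists of (position[3], quaternion[4]), one entry per frame.
    beta: assumed weight on orientation error.
    lam:  assumed weight on the temporal smoothness penalty.
    """
    n = len(pred)
    # Data term: mean per-frame position error plus weighted orientation error.
    data = sum(l2(p, pt) + beta * quat_dist(q, qt)
               for (p, q), (pt, qt) in zip(pred, target)) / n

    # Smoothness term (an assumption): predicted frame-to-frame translation
    # should match the ground-truth frame-to-frame translation.
    smooth = 0.0
    for t in range(1, n):
        d_pred = [a - b for a, b in zip(pred[t][0], pred[t - 1][0])]
        d_true = [a - b for a, b in zip(target[t][0], target[t - 1][0])]
        smooth += l2(d_pred, d_true)
    if n > 1:
        smooth /= n - 1

    return data + lam * smooth
```

With this kind of penalty, a trajectory with a constant offset from the ground truth scores better than a jittery one with the same per-frame error, which is the intuition behind calling the result "smoother". In a trained network the same terms would be computed on the LSTM's per-frame outputs with an automatic differentiation framework.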
