Real-Time 6DOF Pose Relocalization for Event Cameras with Stacked Spatial LSTM Networks
This work addresses the need for accurate and efficient pose estimation in robotics and computer vision. Its contribution is a strong, task-specific gain rather than a foundational advance.
The paper tackles real-time 6DOF pose relocalization for event cameras with a Stacked Spatial LSTM Network, reducing position error by approximately 6 times and orientation error by 3 times compared to the state of the art.
We present a new method to relocalize the 6DOF pose of an event camera based solely on the event stream. Our method first creates an event image from a list of events that occur in a very short time interval; a Stacked Spatial LSTM Network (SP-LSTM) is then used to learn the camera pose. Our SP-LSTM is composed of a CNN that learns deep features from the event images and a stack of LSTMs that learns spatial dependencies in the image feature space. We show that spatial dependency plays an important role in the relocalization task and that the SP-LSTM can effectively learn this information. Experimental results on a publicly available dataset show that our approach generalizes well and outperforms recent methods by a substantial margin. Overall, our proposed method reduces the position error by approx. 6 times and the orientation error by 3 times compared to the current state of the art. The source code and trained models will be released.
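The first step of the pipeline, turning a short window of events into an event image, can be illustrated with a minimal sketch. The abstract does not specify the encoding, so everything here is an assumption made for illustration: the function name `events_to_image`, the `(t, x, y, p)` event layout, and the choice to write 1.0 for a positive-polarity event, 0.0 for a negative one, and 0.5 for pixels that saw no event. When several events hit the same pixel, this sketch keeps the most recent one.

```python
import numpy as np


def events_to_image(events, height, width):
    """Build a single-channel event image from a short window of events.

    Hypothetical helper, not the authors' released code.
    events: iterable of (t, x, y, p) rows, where t is a timestamp,
            (x, y) is the pixel location, and p is the polarity
            (p > 0 is treated as positive, anything else as negative).
    """
    # Assumed encoding: 0.5 marks pixels with no event in the window.
    image = np.full((height, width), 0.5, dtype=np.float32)

    events = np.asarray(events, dtype=np.float64).reshape(-1, 4)

    # Process events in time order so the latest event at a pixel wins.
    order = np.argsort(events[:, 0], kind="stable")
    for _, x, y, p in events[order]:
        image[int(y), int(x)] = 1.0 if p > 0 else 0.0

    return image


if __name__ == "__main__":
    # Four synthetic events on a 2x3 sensor; pixel (x=1, y=0) fires twice.
    window = [
        (0.0010, 1, 0, +1),
        (0.0015, 0, 1, +1),
        (0.0020, 2, 1, -1),
        (0.0030, 1, 0, -1),
    ]
    print(events_to_image(window, height=2, width=3))
```

An image built this way would then be passed to the CNN and the stack of LSTMs described above; that part of the network is not sketched here.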