CV RO | Oct 15, 2020

Empty Cities: a Dynamic-Object-Invariant Space for Visual SLAM

arXiv:2010.07646v14.236 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses robustness issues in vision-based localization and mapping for autonomous systems in dynamic settings. It is an incremental improvement over existing methods.

The paper tackles the problem of dynamic objects degrading visual SLAM in urban environments by presenting a deep learning framework that removes dynamic content and inpaints static backgrounds, resulting in improved performance on visual odometry, place recognition, and multi-view stereo tasks.

In this paper we present a data-driven approach to obtain the static image of a scene, eliminating dynamic objects that might have been present at the time of traversing the scene with a camera. The general objective is to improve vision-based localization and mapping tasks in dynamic environments, where the presence (or absence) of different dynamic objects in different moments makes these tasks less robust. We introduce an end-to-end deep learning framework to turn images of an urban environment that include dynamic content, such as vehicles or pedestrians, into realistic static frames suitable for localization and mapping. This objective faces two main challenges: detecting the dynamic objects, and inpainting the static occluded background. The first challenge is addressed by the use of a convolutional network that learns a multi-class semantic segmentation of the image. The second challenge is approached with a generative adversarial model that, taking as input the original dynamic image and the computed dynamic/static binary mask, is capable of generating the final static image. This framework makes use of two new losses, one based on image steganalysis techniques, useful to improve the inpainting quality, and another one based on ORB features, designed to enhance feature matching between real and hallucinated image regions. To validate our approach, we perform an extensive evaluation on different tasks that are affected by dynamic entities, i.e., visual odometry, place recognition and multi-view stereo, with the hallucinated images. Code has been made available on https://github.com/bertabescos/EmptyCities_SLAM.
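The abstract describes a multi-class semantic segmentation being reduced to a dynamic/static binary mask before inpainting. The following is a minimal sketch of that reduction step only, not the authors' implementation. The class ids in `DYNAMIC_CLASS_IDS` and the helper names `dynamic_mask` and `dynamic_fraction` are hypothetical, chosen for illustration; the real label set depends on the segmentation network and dataset.

```python
from typing import List, Set

# Hypothetical class ids (assumption): which labels count as "dynamic"
# depends on the segmentation network and its training dataset.
DYNAMIC_CLASS_IDS: Set[int] = {11, 12, 13}  # e.g. pedestrian, rider, vehicle


def dynamic_mask(labels: List[List[int]],
                 dynamic_ids: Set[int] = DYNAMIC_CLASS_IDS) -> List[List[int]]:
    """Collapse a per-pixel class-label map into a binary mask.

    Returns 1 where the pixel belongs to a dynamic class, 0 otherwise.
    """
    return [[1 if c in dynamic_ids else 0 for c in row] for row in labels]


def dynamic_fraction(mask: List[List[int]]) -> float:
    """Fraction of pixels marked dynamic (0.0 for an empty mask)."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total if total else 0.0


# Toy 3x4 label map: 0 and 2 stand for static classes (assumed ids).
labels = [
    [0, 0, 11, 11],
    [0, 13, 13, 0],
    [2, 2, 2, 2],
]
mask = dynamic_mask(labels)
```

In a pipeline like the one the abstract outlines, a mask of this kind would be passed to the generative model together with the original image; the nested-list representation here is only for readability and would normally be an array or tensor.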
