A Review on Visual-SLAM: Advancements from Geometric Modelling to Learning-based Semantic Scene Understanding
This review addresses the need for more reliable SLAM systems in robotics, but its contribution is incremental: it surveys existing methods rather than introducing new ones.
This review paper tackles the problem of Visual-SLAM for autonomous mobile robots by summarising advancements from geometric modelling to learning-based semantic scene understanding, highlighting how deep learning techniques provide data-driven approaches to improve robustness in challenging environments.
Simultaneous Localisation and Mapping (SLAM) is one of the fundamental problems in autonomous mobile robots, where a robot needs to reconstruct a previously unseen environment while simultaneously localising itself with respect to the map. In particular, Visual-SLAM uses various sensors from the mobile robot for collecting and sensing a representation of the map. Traditionally, geometric model-based techniques were used to tackle the SLAM problem, but they tend to be error-prone in challenging environments. Recent advancements in computer vision, such as deep learning techniques, have provided a data-driven approach to tackle the Visual-SLAM problem. This review summarises recent advancements in the Visual-SLAM domain using various learning-based methods. We begin by providing a concise overview of the geometric model-based approaches, followed by technical reviews of the current paradigms in SLAM. Then, we present the various learning-based approaches to collecting sensory inputs from mobile robots and performing scene understanding. The current paradigms in deep-learning-based semantic understanding are discussed and placed in the context of Visual-SLAM. Finally, we discuss challenges and further opportunities in the direction of learning-based approaches in Visual-SLAM.