D$^3$FlowSLAM: Self-Supervised Dynamic SLAM with Flow Motion Decomposition and DINO Guidance
This work addresses accurate localization and mapping in dynamic environments, a known bottleneck for robotics and autonomous systems, and proposes a novel method for it.
The paper tackles the problem of robust SLAM in dynamic scenes by introducing a self-supervised method that decomposes motion into static and dynamic flows, achieving superior accuracy compared to other self-supervised methods and matching or surpassing some supervised methods.
In this paper, we introduce a self-supervised deep SLAM method that operates robustly in dynamic scenes while accurately identifying dynamic components. Our method leverages a dual-flow representation that separates static flow from dynamic flow, facilitating effective scene decomposition in dynamic environments. We propose a dynamic update module based on this representation and develop a dense SLAM system that excels in dynamic scenarios. In addition, we design a self-supervised training scheme that uses DINO as a prior, enabling label-free training. Our method achieves higher accuracy than other self-supervised methods, and in some cases it matches or even surpasses existing supervised methods. All code and data will be made publicly available upon acceptance.
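As a rough illustration of the dual-flow idea, the sketch below splits an observed per-pixel flow into a camera-induced part and a residual. It is not the paper's implementation: the abstract does not specify how the two flows are parameterized or estimated, so the sketch assumes a pinhole camera, a known depth and relative pose, and that static flow is the reprojection displacement of a stationary 3D point. The function names `static_flow` and `dynamic_flow` and all parameters are hypothetical.

```python
from typing import Sequence, Tuple

Vec2 = Tuple[float, float]


def static_flow(
    pixel: Vec2,
    depth: float,
    intrinsics: Tuple[float, float, float, float],  # (fx, fy, cx, cy), assumed pinhole model
    rotation: Sequence[Sequence[float]],            # 3x3 relative rotation, frame 1 -> frame 2
    translation: Sequence[float],                   # 3-vector relative translation
) -> Vec2:
    """Flow induced at `pixel` by camera motion alone, assuming the 3D point is static."""
    u, v = pixel
    fx, fy, cx, cy = intrinsics

    # Back-project the pixel to a 3D point in the first camera frame.
    p = ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)

    # Rigidly transform the point into the second camera frame: q = R p + t.
    q = [
        sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if q[2] <= 0:
        raise ValueError("point falls behind the second camera")

    # Project into the second image and take the pixel displacement.
    u2 = fx * q[0] / q[2] + cx
    v2 = fy * q[1] / q[2] + cy
    return (u2 - u, v2 - v)


def dynamic_flow(total_flow: Vec2, static: Vec2) -> Vec2:
    """Residual flow left after removing the camera-induced component (assumed additive)."""
    return (total_flow[0] - static[0], total_flow[1] - static[1])


# Example: a pure sideways camera translation and an observed flow that exceeds it.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
s = static_flow((50.0, 50.0), 2.0, (100.0, 100.0, 50.0, 50.0), identity, [0.5, 0.0, 0.0])
d = dynamic_flow((28.0, 1.0), s)
```

Under these assumptions, a pixel whose residual `d` is close to zero is consistent with a static scene point, while a large residual suggests independent object motion. A learned system would estimate both components jointly rather than subtracting them from known depth and pose.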