RO | AI | CV | Nov 28, 2024

Visual SLAMMOT Considering Multiple Motion Models

arXiv:2411.19134v2 | 2 citations | h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses the challenge of dynamic outdoor scenarios for autonomous vehicles by tightly coupling SLAM and MOT, but it is incremental as it extends a previous LiDAR-based method to the visual domain.

The paper tackles the problem of integrating Simultaneous Localization and Mapping (SLAM) with Multi-Object Tracking (MOT) in autonomous driving by proposing a visual SLAMMOT method that considers multiple motion models, demonstrating its feasibility and advantages in bridging LiDAR and vision-based sensing.

Simultaneous Localization and Mapping (SLAM) and Multi-Object Tracking (MOT) are pivotal tasks in autonomous driving and attract considerable research attention. SLAM builds a map in real time and determines the vehicle's pose in unfamiliar settings, while MOT identifies and tracks multiple dynamic objects in real time. Despite their importance, the prevalent approach treats SLAM and MOT as independent modules within an autonomous vehicle system, which leads to inherent limitations. Classical SLAM methodologies often rely on a static-environment assumption, which suits indoor scenes better than dynamic outdoor scenarios. Conversely, conventional MOT techniques typically assume the vehicle's state is known, so the accuracy of object state estimates is bounded by the accuracy of this prior. To address these challenges, previous efforts introduced the unified SLAMMOT paradigm, yet primarily focused on simplistic motion patterns. In our team's previous work IMM-SLAMMOT\cite{IMM-SLAMMOT}, we presented a novel methodology that incorporates multiple motion models into SLAMMOT (i.e., tightly coupled SLAM and MOT) and demonstrated its efficacy in LiDAR-based systems. This paper studies the feasibility and advantages of instantiating this methodology as visual SLAMMOT, bridging the gap between LiDAR and vision-based sensing mechanisms. Specifically, we propose a visual SLAMMOT solution that considers multiple motion models and validate the inherent advantages of IMM-SLAMMOT in the visual domain.
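
To make "considering multiple motion models" concrete, the sketch below shows one common way a tracker can weigh competing motion hypotheses for a single object. It is an illustration only, not the paper's method: the two models (constant velocity, and constant turn rate and velocity), the state layout `(x, y, heading, speed, turn_rate)`, the isotropic Gaussian likelihood, and every function name are assumptions made for this example. The abstract does not specify any of them. A full interacting-multiple-model filter would also mix the per-model states and covariances, which is omitted here.

```python
import math

def predict_cv(state, dt):
    """Constant-velocity model: the object keeps its heading and speed."""
    x, y, heading, speed, _ = state
    return (x + speed * math.cos(heading) * dt,
            y + speed * math.sin(heading) * dt,
            heading, speed, 0.0)

def predict_ctrv(state, dt):
    """Constant turn rate and velocity model: the object moves along an arc."""
    x, y, heading, speed, turn_rate = state
    if abs(turn_rate) < 1e-9:
        # Degenerates to straight-line motion when the turn rate is ~0.
        return predict_cv(state, dt)
    new_heading = heading + turn_rate * dt
    radius = speed / turn_rate
    return (x + radius * (math.sin(new_heading) - math.sin(heading)),
            y - radius * (math.cos(new_heading) - math.cos(heading)),
            new_heading, speed, turn_rate)

def position_likelihood(predicted, observed_xy, sigma):
    """Isotropic 2D Gaussian likelihood of an observed position (assumed noise model)."""
    dx = predicted[0] - observed_xy[0]
    dy = predicted[1] - observed_xy[1]
    norm = 2.0 * math.pi * sigma * sigma
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma)) / norm

def update_model_probabilities(priors, likelihoods):
    """Bayes update of the per-model probabilities; keeps priors if all likelihoods vanish."""
    weighted = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(weighted)
    if total == 0.0:
        return list(priors)
    return [w / total for w in weighted]

def score_models(state, observed_xy, dt, priors, sigma=0.5):
    """Predict with each model and return the updated model probabilities."""
    predictions = [predict_cv(state, dt), predict_ctrv(state, dt)]
    likelihoods = [position_likelihood(p, observed_xy, sigma) for p in predictions]
    return update_model_probabilities(priors, likelihoods)
```

For example, calling `score_models` with an observation that lies on the arc predicted by the turning model shifts probability mass toward that model, which is the behaviour a multiple-model tracker relies on when an object switches between driving straight and turning.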

