CV | Mar 29, 2024

SceneTracker: Long-term Scene Flow Estimation Network

arXiv:2403.19924v4 | 24 citations | h-index: 2 | Has Code | IEEE Trans Pattern Anal Mach Intell
Originality: Incremental advance
AI Analysis

This work addresses the need for coherent temporal motion estimation in 3D scenes, particularly for applications like autonomous driving, but it is incremental as it builds on existing scene flow methods by extending them to long-term contexts.

The paper tackles long-term scene flow estimation by proposing SceneTracker, a network that captures fine-grained and long-term 3D motion online. It shows superior capabilities in handling 3D spatial occlusion and depth noise interference, and experiments on a new real-world dataset, LSFDriving, demonstrate its generalization abilities.

Scene flow estimation focuses well in the spatial domain but lacks coherence in the temporal domain. This study therefore proposes long-term scene flow estimation (LSFE), a comprehensive task that simultaneously captures fine-grained and long-term 3D motion in an online manner. We introduce SceneTracker, the first LSFE network, which adopts an iterative approach to approximate the optimal 3D trajectory. The network dynamically and simultaneously indexes and constructs appearance correlation and depth residual features. Transformers are then employed to explore and utilize long-range connections within and between trajectories. In detailed experiments, SceneTracker shows superior capabilities in addressing 3D spatial occlusion and depth noise interference, capabilities that are highly tailored to the needs of the LSFE task. We build a real-world evaluation dataset, LSFDriving, for the LSFE field and use it in experiments to further demonstrate the advantage of SceneTracker in generalization abilities. The code and data are available at https://github.com/wwsource/SceneTracker.
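
The iterative idea in the abstract can be illustrated with a toy sketch. This is not the SceneTracker implementation: the trajectory layout `(u, v, d)`, the function names `sample_depth`, `refine_trajectories`, and `toy_update`, and the definition of the depth residual as "observed depth at the tracked pixel minus the trajectory's current depth" are all assumptions made for illustration. In the paper, a Transformer predicts the update from appearance correlation and depth residual features; here a hand-written rule stands in for it.

```python
import numpy as np

def sample_depth(depth_maps, uv):
    """Nearest-neighbour depth lookup (an assumption; a real system may interpolate).

    depth_maps: (T, H, W) array, one depth map per frame.
    uv:         (T, N, 2) pixel coordinates as (u, v) = (column, row).
    Returns a (T, N) array of depths at the tracked pixels.
    """
    T, H, W = depth_maps.shape
    u = np.clip(np.rint(uv[..., 0]).astype(int), 0, W - 1)
    v = np.clip(np.rint(uv[..., 1]).astype(int), 0, H - 1)
    t = np.arange(T)[:, None]  # broadcast frame index over the N points
    return depth_maps[t, v, u]

def refine_trajectories(traj, depth_maps, update_fn, num_iters=4):
    """Iteratively refine 3D trajectories of shape (T, N, 3) holding (u, v, d)."""
    traj = traj.copy()
    for _ in range(num_iters):
        # Hypothetical depth residual: observed depth minus current estimate.
        depth_residual = sample_depth(depth_maps, traj[..., :2]) - traj[..., 2]
        # update_fn stands in for the learned update module.
        traj = traj + update_fn(traj, depth_residual)
    return traj

def toy_update(traj, depth_residual):
    """Placeholder update: move depth halfway toward the observation."""
    delta = np.zeros_like(traj)
    delta[..., 2] = 0.5 * depth_residual
    return delta

# Three frames of constant depth 10; two points initialised at depth 2.
depth_maps = np.full((3, 4, 5), 10.0)
init = np.zeros((3, 2, 3))
init[..., :2] = 1.0   # pixel (u, v) = (1, 1) in every frame
init[..., 2] = 2.0    # initial depth guess
out = refine_trajectories(init, depth_maps, toy_update, num_iters=4)
```

With this placeholder rule the depth error halves on every iteration, which shows the general shape of refinement by repeated residual-driven updates. The learned update in the actual network would also move `(u, v)` and handle occlusion, neither of which this sketch attempts.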

Code Implementations: 1 repo
