CV · Nov 17, 2021

Temporally Consistent Online Depth Estimation in Dynamic Scenes

arXiv:2111.09337v3 · 39 citations
AI Analysis

This paper addresses the challenge of keeping depth estimates stable over time in dynamic environments for real-time systems. It is an incremental improvement over existing techniques.

The paper tackles temporally consistent depth estimation in dynamic scenes for online applications such as augmented reality. It presents the CODD framework, which outperforms competing methods in temporal consistency while matching them in per-frame accuracy.

Temporally consistent depth estimation is crucial for online applications such as augmented reality. While stereo depth estimation has received substantial attention as a promising way to generate 3D information, there is relatively little work focused on maintaining temporal stability. Indeed, based on our analysis, current techniques still suffer from poor temporal consistency. Stabilizing depth temporally in dynamic scenes is challenging due to concurrent object and camera motion. In an online setting, this process is further aggravated because only past frames are available. We present a framework named Consistent Online Dynamic Depth (CODD) to produce temporally consistent depth estimates in dynamic scenes in an online setting. CODD augments per-frame stereo networks with novel motion and fusion networks. The motion network accounts for dynamics by predicting a per-pixel SE3 transformation and aligning the observations. The fusion network improves temporal depth consistency by aggregating the current and past estimates. We conduct extensive experiments and demonstrate quantitatively and qualitatively that CODD outperforms competing methods in terms of temporal consistency and performs on par in terms of per-frame accuracy.
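The align-then-fuse idea in the abstract can be illustrated with a small NumPy sketch. This is not the CODD implementation: in the paper, learned networks predict both the per-pixel motion and the fusion, whereas here the per-pixel rotations `R`, translations `t`, and the fixed blending `weight` are hand-supplied assumptions, and all function names are hypothetical. The sketch lifts the previous depth map to 3D, moves every pixel with its own rigid transform, re-projects into the current view, and blends with the current estimate wherever a past observation lands.

```python
import numpy as np

def backproject(depth, K):
    """Lift an (H, W) depth map to (H, W, 3) camera-space points (pinhole model)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - K[0, 2]) / K[0, 0] * depth
    y = (v - K[1, 2]) / K[1, 1] * depth
    return np.stack([x, y, depth], axis=-1)

def apply_per_pixel_se3(points, R, t):
    """Apply one rigid transform per pixel: R is (H, W, 3, 3), t is (H, W, 3)."""
    return np.einsum("hwij,hwj->hwi", R, points) + t

def project_to_depth(points, K, shape):
    """Splat 3D points into an (H, W) depth map, keeping the nearest point per pixel."""
    h, w = shape
    x, y, z = (points[..., i].ravel() for i in range(3))
    in_front = z > 1e-6
    safe_z = np.where(in_front, z, 1.0)  # avoid division by zero for rejected points
    u = np.round(K[0, 0] * x / safe_z + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * y / safe_z + K[1, 2]).astype(int)
    ok = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    depth = np.full((h, w), np.inf)
    np.minimum.at(depth, (v[ok], u[ok]), z[ok])  # z-buffer: smallest depth wins
    depth[np.isinf(depth)] = np.nan              # pixels with no past observation
    return depth

def fuse(current, aligned_past, weight=0.5):
    """Blend current and aligned past depth; the fixed weight is an assumption,
    standing in for what a learned fusion network would predict."""
    valid = ~np.isnan(aligned_past)
    fused = current.copy()
    fused[valid] = weight * aligned_past[valid] + (1.0 - weight) * current[valid]
    return fused

# Toy usage with made-up intrinsics and a static scene (identity motion).
H, W = 4, 6
K = np.array([[100.0, 0.0, 3.0], [0.0, 100.0, 2.0], [0.0, 0.0, 1.0]])
past_depth = np.full((H, W), 2.0)
current_depth = np.full((H, W), 4.0)
R = np.broadcast_to(np.eye(3), (H, W, 3, 3))
t = np.zeros((H, W, 3))

aligned = project_to_depth(apply_per_pixel_se3(backproject(past_depth, K), R, t), K, (H, W))
fused = fuse(current_depth, aligned, weight=0.5)
```

Pixels that receive no past observation after alignment simply keep the current estimate, which is one simple way to handle disocclusions in an online setting where only past frames are available.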
