CV · Feb 21, 2024

Scene Prior Filtering for Depth Super-Resolution

arXiv:2402.13876v3 · 3 citations · h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses depth super-resolution for applications such as robotics and AR/VR. It is an incremental advance: it extends existing guided filtering methods with new scene priors.

The paper tackles texture interference and edge inaccuracy in depth super-resolution by introducing a Scene Prior Filtering network (SPFNet) that uses scene priors such as surface normals and semantic maps, achieving state-of-the-art performance on real and synthetic datasets.

Multi-modal fusion is vital to the success of super-resolution of depth maps. However, commonly used fusion strategies, such as addition and concatenation, fall short of effectively bridging the modal gap. As a result, guided image filtering methods have been introduced to mitigate this issue. Nevertheless, it is observed that their filter kernels usually encounter significant texture interference and edge inaccuracy. To tackle these two challenges, we introduce a Scene Prior Filtering network, SPFNet, which utilizes the priors surface normal and semantic map from large-scale models. Specifically, we design an All-in-one Prior Propagation that computes the similarity between multi-modal scene priors, i.e., RGB, normal, semantic, and depth, to reduce the texture interference. In addition, we present a One-to-one Prior Embedding that continuously embeds each single-modal prior into depth using Mutual Guided Filtering, further alleviating the texture interference while enhancing edges. Our SPFNet has been extensively evaluated on both real and synthetic datasets, achieving state-of-the-art performance.
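
For readers unfamiliar with the guided image filtering that the abstract builds on, the sketch below shows the classic, non-learned guided filter applied to a single 1D scanline. It is not SPFNet, its Mutual Guided Filtering, or any code from the paper; the function names `box_mean` and `guided_filter_1d`, the radius `r`, and the regularizer `eps` are illustrative assumptions. The idea is that the output is a locally linear function of the guide, so edges in the guide (for example, RGB intensity) carry over to the filtered signal (for example, upsampled depth).

```python
def box_mean(x, r):
    """Mean of x over a sliding window of radius r, clipped at the borders."""
    n = len(x)
    out = []
    for i in range(n):
        lo, hi = max(0, i - r), min(n, i + r + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def guided_filter_1d(guide, src, r=2, eps=1e-3):
    """Classic guided filter on a 1D scanline (illustrative sketch only).

    guide: guidance signal, e.g. RGB intensity along a row.
    src:   signal to filter, e.g. coarsely upsampled depth along the same row.
    """
    mean_g = box_mean(guide, r)
    mean_s = box_mean(src, r)
    corr_gg = box_mean([g * g for g in guide], r)
    corr_gs = box_mean([g * s for g, s in zip(guide, src)], r)

    # Fit a per-window linear model: src ~ a * guide + b.
    a, b = [], []
    for mg, ms, gg, gs in zip(mean_g, mean_s, corr_gg, corr_gs):
        var_g = gg - mg * mg      # variance of the guide in the window
        cov_gs = gs - mg * ms     # covariance of guide and source
        a_k = cov_gs / (var_g + eps)
        a.append(a_k)
        b.append(ms - a_k * mg)

    # Average the coefficients of all windows covering each sample.
    mean_a = box_mean(a, r)
    mean_b = box_mean(b, r)
    return [ma * g + mb for ma, mb, g in zip(mean_a, mean_b, guide)]


if __name__ == "__main__":
    # Sharp edge in the guide, blurry edge in the (hypothetical) depth row.
    guide = [0.0] * 8 + [1.0] * 8
    depth = [1.0] * 6 + [1.2, 1.4, 1.6, 1.8] + [2.0] * 6
    print([round(v, 3) for v in guided_filter_1d(guide, depth)])
```

Because the filter copies structure from the guide, any texture in the guide that has no counterpart in the depth can also leak into the output. This is the texture interference that the abstract says the scene priors are meant to reduce.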

