CV Apr 13, 2021

Learning Multi-modal Information for Robust Light Field Depth Estimation

Yongri Piao, Xinxin Ji, Miao Zhang, Yukun Zhang

arXiv:2104.05971v1 5.64 citations Has Code

Originality Incremental advance

AI Analysis

This work addresses robust depth estimation for light field imaging. The advance is incremental: it improves upon existing focal stack-based methods by incorporating multi-modal information.

The paper tackles the problem of suboptimal depth estimation from light field focal stacks due to defocus blur by proposing a multi-modal learning method that extracts contextual information from both focal stacks and RGB images, then fuses them with an attention-guided module. The method achieves superior performance compared to existing methods on two light field datasets, with visual results demonstrating applicability to mobile phone data.

Light field data has been demonstrated to facilitate the depth estimation task. Most learning-based methods estimate the depth information from EPI or sub-aperture images, while fewer methods pay attention to the focal stack. Existing learning-based methods that estimate depth from the focal stack perform suboptimally because of defocus blur. In this paper, we propose a multi-modal learning method for robust light field depth estimation. We first excavate the internal spatial correlation by designing a context reasoning unit which separately extracts comprehensive contextual information from the focal stack and RGB images. Then we integrate the contextual information by exploiting an attention-guided cross-modal fusion module. Extensive experiments demonstrate that our method achieves performance superior to existing representative methods on two light field datasets. Moreover, visual results on a mobile phone dataset show that our method can be widely used in daily life.
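
The abstract does not specify how the attention-guided cross-modal fusion module is built, so the following is only a minimal sketch of the general idea, not the authors' architecture. It assumes two feature maps of identical shape (one from a focal stack branch, one from an RGB branch) and a per-pixel sigmoid gate computed by a 1x1 convolution over the concatenated channels. The names `attention_fuse`, `focal_feat`, `rgb_feat`, `w_gate`, and `b_gate` are hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def attention_fuse(focal_feat, rgb_feat, w_gate, b_gate=0.0):
    """Blend two (C, H, W) feature maps with a per-pixel attention gate.

    Assumption: this gating scheme is a generic illustration of
    attention-guided fusion, not the module described in the paper.
    """
    if focal_feat.shape != rgb_feat.shape:
        raise ValueError("both feature maps must have the same shape")

    # Concatenate the two modalities along the channel axis: (2C, H, W).
    stacked = np.concatenate([focal_feat, rgb_feat], axis=0)

    # A 1x1 convolution with one output channel is a weighted sum over
    # channels at every pixel: (2C,) contracted with (2C, H, W) -> (H, W).
    logits = np.tensordot(w_gate, stacked, axes=([0], [0])) + b_gate

    # Gate values lie in (0, 1); values near 1 favor the focal stack branch.
    gate = sigmoid(logits)

    # The (H, W) gate broadcasts across the channel axis of each map.
    fused = gate * focal_feat + (1.0 - gate) * rgb_feat
    return fused, gate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    focal = rng.standard_normal((4, 3, 5))  # hypothetical focal stack features
    rgb = rng.standard_normal((4, 3, 5))    # hypothetical RGB features
    weights = rng.standard_normal(8)        # one weight per stacked channel
    fused, gate = attention_fuse(focal, rgb, weights)
    print(fused.shape, gate.shape)
```

In a trained network the gate weights would be learned, so the model could lean on RGB context in regions where defocus blur makes the focal stack features unreliable. Here they are random values used only to exercise the shapes.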
