CV RO

Sep 1, 2020

Bidirectional Attention Network for Monocular Depth Estimation

Shubhra Aich, Jean Marie Uwabeza Vianney, Md Amirul Islam, Mannat Kaur, Bingbing Liu

arXiv:2009.00743v2

19.888 citations

Originality: Incremental advance

AI Analysis

This work addresses a key limitation of convolutional neural networks for depth estimation, namely their difficulty in integrating local and global information, and offers a lighter-weight solution for applications like autonomous driving and robotics.

The paper tackles monocular depth estimation by proposing a Bidirectional Attention Network (BANet) that better integrates local and global information. On the KITTI and DIODE datasets, it performs on par with or better than state-of-the-art methods while using less memory and computation.

In this paper, we propose a Bidirectional Attention Network (BANet), an end-to-end framework for monocular depth estimation (MDE) that addresses the limitation of effectively integrating local and global information in convolutional neural networks. The structure of this mechanism derives from a strong conceptual foundation of neural machine translation, and presents a light-weight mechanism for adaptive control of computation similar to the dynamic nature of recurrent neural networks. We introduce bidirectional attention modules that utilize the feed-forward feature maps and incorporate the global context to filter out ambiguity. Extensive experiments reveal the high degree of capability of this bidirectional attention model over feed-forward baselines and other state-of-the-art methods for monocular depth estimation on two challenging datasets -- KITTI and DIODE. We show that our proposed approach either outperforms or performs at least on a par with the state-of-the-art monocular depth estimation methods with less memory and computational complexity.
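The abstract describes the attention modules only at a high level, so the following is a minimal sketch of the general idea rather than BANet itself. It assumes a list of stage-wise feature maps already resized to a common resolution, and it uses a hypothetical scoring rule (forward running mean plus backward running mean plus a globally pooled scalar per stage) that is not taken from the paper. The function names `global_context`, `running_means`, and `bidirectional_attention` are illustrative.

```python
import math


def global_context(fmap):
    """Global average pooling: one scalar summarising the whole map."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


def running_means(maps):
    """Per-pixel running mean over the stages, in the order given."""
    h, w = len(maps[0]), len(maps[0][0])
    acc = [[0.0] * w for _ in range(h)]
    out = []
    for k, m in enumerate(maps, start=1):
        acc = [[acc[i][j] + m[i][j] for j in range(w)] for i in range(h)]
        out.append([[acc[i][j] / k for j in range(w)] for i in range(h)])
    return out


def bidirectional_attention(maps):
    """Fuse stage-wise maps with per-pixel softmax weights over stages.

    Assumed scoring rule (illustrative only, not from the paper):
    score = forward context + backward context + global context.
    """
    n = len(maps)
    h, w = len(maps[0]), len(maps[0][0])
    fwd = running_means(maps)              # first stage -> last stage
    bwd = running_means(maps[::-1])[::-1]  # last stage -> first stage
    ctx = [global_context(m) for m in maps]

    fused = [[0.0] * w for _ in range(h)]
    weights = [[[0.0] * w for _ in range(h)] for _ in range(n)]
    for i in range(h):
        for j in range(w):
            scores = [fwd[s][i][j] + bwd[s][i][j] + ctx[s] for s in range(n)]
            top = max(scores)  # subtract the max for numerical stability
            exps = [math.exp(x - top) for x in scores]
            total = sum(exps)
            for s in range(n):
                weights[s][i][j] = exps[s] / total
                fused[i][j] += weights[s][i][j] * maps[s][i][j]
    return fused, weights
```

Because the weights at each pixel form a softmax over stages, the fused value is a convex combination of the stage-wise values at that pixel. In a real network the scores would come from learned convolutions rather than fixed averages.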
