CV · AI · Sep 17, 2023

Deep Neighbor Layer Aggregation for Lightweight Self-Supervised Monocular Depth Estimation

arXiv:2309.09272v2 · 9 citations · h-index: 8 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses efficiency needs for robotics and autonomous driving applications, offering an incremental improvement in model design.

The paper tackles the high computational cost of self-supervised monocular depth estimation by proposing a lightweight fully convolutional network with contextual feature fusion and convolution-based channel attention. On the KITTI benchmark it reports better results than larger models such as Monodepth2 while using only 30% of the parameters.

With the frequent use of self-supervised monocular depth estimation in robotics and autonomous driving, model efficiency is becoming increasingly important. Most current approaches rely on much larger and more complex networks to improve the precision of depth estimation. Some researchers have incorporated Transformers into self-supervised monocular depth estimation to achieve better performance, but this leads to high parameter counts and heavy computation. We present a fully convolutional depth estimation network using contextual feature fusion. In contrast to UNet++ and HRNet, which rely on long-range fusion, we fuse high-resolution and low-resolution features to preserve information on small targets and fast-moving objects. We further improve the depth estimation results by employing lightweight, convolution-based channel attention in the decoder stage. Our method reduces the number of parameters without sacrificing accuracy. Experiments on the KITTI benchmark show that our method achieves better results than many large models, such as Monodepth2, with only 30% of the parameters. The source code is available at https://github.com/boyagesmile/DNA-Depth.
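The two ideas named in the abstract, fusing a high-resolution feature map with a neighboring low-resolution one and reweighting channels with a convolution-based attention, can be illustrated with a small NumPy sketch. This is not the DNA-Depth implementation: the function names (`upsample_nearest`, `fuse_neighbors`, `channel_attention`), the use of plain concatenation for fusion, and the ECA-style 1D convolution over pooled channel statistics are all assumptions chosen for illustration.

```python
import numpy as np


def upsample_nearest(x: np.ndarray, factor: int = 2) -> np.ndarray:
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def fuse_neighbors(high_res: np.ndarray, low_res: np.ndarray) -> np.ndarray:
    """Fuse a high-resolution map with the adjacent lower-resolution one.

    Assumption: the low-resolution map has half the spatial size, and
    fusion is a channel-wise concatenation after 2x upsampling.
    """
    return np.concatenate([high_res, upsample_nearest(low_res, 2)], axis=0)


def channel_attention(x: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Hypothetical ECA-style channel attention on a (C, H, W) map.

    Global average pooling gives one value per channel; a small 1D
    convolution mixes neighbouring channels; a sigmoid turns the result
    into per-channel weights. The only parameters are the kernel taps.
    """
    k = len(kernel)
    assert k % 2 == 1, "kernel length must be odd"
    pooled = x.mean(axis=(1, 2))                 # shape (C,)
    padded = np.pad(pooled, k // 2)              # zero-pad both ends
    mixed = np.array([padded[i:i + k] @ kernel for i in range(x.shape[0])])
    weights = 1.0 / (1.0 + np.exp(-mixed))       # sigmoid, shape (C,)
    return x * weights[:, None, None]


# Small demonstration with random features.
rng = np.random.default_rng(0)
high = rng.standard_normal((16, 8, 8))   # high-resolution features
low = rng.standard_normal((32, 4, 4))    # neighbouring low-resolution features
fused = fuse_neighbors(high, low)        # shape (48, 8, 8)
attended = channel_attention(fused, np.array([0.25, 0.5, 0.25]))
print(fused.shape, attended.shape)
```

In this sketch the attention step adds only three learnable values (the kernel taps), which is the kind of saving that makes convolution-based channel attention attractive when the goal is a small parameter count.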

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
