Edge-Guided Occlusion Fading Reduction for a Light-Weighted Self-Supervised Monocular Depth Estimation
This work addresses occlusion fading in self-supervised monocular depth estimation, an incremental improvement relevant to computer vision applications such as autonomous driving.
The paper tackles occlusion fading in self-supervised monocular depth estimation by proposing an Edge-Guided post-processing method and integrating Atrous Spatial Pyramid Pooling (ASPP) into the network. The resulting network is lighter, with 8.1 million parameters, runs at up to 40 FPS, and outperforms the state of the art on the KITTI benchmarks.
Self-supervised monocular depth estimation methods generally suffer from occlusion fading because they lack supervision from per-pixel ground truth. Although Godard et al. proposed a post-processing method to reduce occlusion fading, the compensated results exhibit a severe halo effect. In this paper, we propose a novel Edge-Guided post-processing method to reduce occlusion fading in self-supervised monocular depth estimation. We further introduce Atrous Spatial Pyramid Pooling (ASPP) into the network to reduce the computational cost and improve inference performance. The proposed ASPP-based network is lighter, faster, and more accurate than the depth estimation networks in common use. This lightweight network needs only 8.1 million parameters and achieves up to 40 frames per second for $256\times512$ input at inference time on a single NVIDIA GTX 1080 GPU. The proposed network also outperforms the current state of the art on the KITTI benchmarks. Together, the ASPP-based network and Edge-Guided post-processing produce better results than competing methods, both quantitatively and qualitatively.
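For context, the baseline flip-and-blend post-processing attributed to Godard et al. above can be sketched roughly as follows. This is a minimal illustration of that baseline only, not the Edge-Guided method proposed here. The function name `blend_flipped`, the list-of-rows disparity representation, and the ramp constants `ramp_start` and `ramp_slope` are assumptions chosen for illustration.

```python
def blend_flipped(disp, disp_from_flipped, ramp_start=0.05, ramp_slope=20.0):
    """Blend two disparity maps given as lists of rows (H x W).

    disp              : prediction for the original image
    disp_from_flipped : prediction for the horizontally mirrored image,
                        not yet flipped back
    The ramp constants are illustrative assumptions.
    """
    height, width = len(disp), len(disp[0])
    blended = []
    for y in range(height):
        original = disp[y]
        # Mirror the second prediction back into the original frame.
        flipped_back = disp_from_flipped[y][::-1]
        row = []
        for x in range(width):
            u = x / (width - 1) if width > 1 else 0.0
            # Weight is 1 at the left border and ramps down to 0.
            left_w = 1.0 - min(max(ramp_slope * (u - ramp_start), 0.0), 1.0)
            # Mirrored weight: 1 at the right border, ramping down to 0.
            right_w = 1.0 - min(max(ramp_slope * ((1.0 - u) - ramp_start), 0.0), 1.0)
            mean = 0.5 * (original[x] + flipped_back[x])
            # Near the left border use the flipped-back prediction, near the
            # right border use the original one, and average in between.
            row.append(right_w * original[x]
                       + left_w * flipped_back[x]
                       + (1.0 - left_w - right_w) * mean)
        blended.append(row)
    return blended


# Toy usage: two constant maps make the three blending regions easy to see.
result = blend_flipped([[1.0] * 21], [[3.0] * 21])
```

In this sketch the two predictions are simply averaged away from the image borders, with no regard for where object boundaries lie. An edge-guided scheme would presumably restrict the compensation using edge information instead, but the details of that step are not given in this abstract.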