CV · LG · IV · Aug 4, 2019

Adversarial View-Consistent Learning for Monocular Depth Estimation

arXiv:1908.01301v1
AI Analysis

This work addresses monocular depth estimation for computer vision applications, presenting an incremental improvement by incorporating multi-view consistency.

The paper addresses monocular depth estimation, where treating the task as pixel-level regression ignores scene geometry and can yield sub-optimal solutions. It proposes an adversarial view-consistent learning framework that forces predicted depth maps to remain reasonable when viewed from multiple views, and it reports a promising performance gain on the NYU Depth V2 dataset.

This paper addresses the problem of Monocular Depth Estimation (MDE). Existing approaches to MDE usually model it as a pixel-level regression problem, ignoring the underlying geometric properties. We empirically find that this may result in a sub-optimal solution: while the predicted depth map has a small loss value in one specific view, it may exhibit a large loss when viewed from different directions. In this paper, inspired by multi-view stereo (MVS), we propose an Adversarial View-Consistent Learning (AVCL) framework to force the estimated depth map to be reasonable when viewed from multiple views. To this end, we first design a differentiable depth map warping operation, which is end-to-end trainable, and then propose a pose generator to generate novel views for a given image in an adversarial manner. Collaborating with the differentiable depth map warping operation, the pose generator encourages the depth estimation network to learn from hard views and hence produce view-consistent depth maps. We evaluate our method on the NYU Depth V2 dataset, and the experimental results show promising performance gains over state-of-the-art MDE approaches.
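To make the idea of warping a depth map into a novel view concrete, the following is a minimal sketch, not the paper's operation. It assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy` and a rigid pose given by a rotation `R` and translation `t`; the function names `warp_depth` and `view_loss` are illustrative. Each pixel is back-projected to 3D, moved into the new camera frame, and re-projected with a simple z-buffer. The nearest-pixel rounding used here is not differentiable, so an end-to-end trainable version would need a soft (for example bilinear) splatting step instead.

```python
def warp_depth(depth, fx, fy, cx, cy, R, t):
    """Warp an H x W depth map (list of lists) into a new view.

    R is a 3x3 rotation (list of lists) and t a 3-vector, both mapping
    source-camera coordinates to the novel camera. Pixels that receive
    no point are left as None (holes).
    """
    h, w = len(depth), len(depth[0])
    warped = [[None] * w for _ in range(h)]
    for v in range(h):
        for u in range(w):
            z = depth[v][u]
            # Back-project pixel (u, v) with depth z to a 3D point.
            p = ((u - cx) * z / fx, (v - cy) * z / fy, z)
            # Rigid transform into the novel camera frame.
            x2, y2, z2 = (
                sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)
            )
            if z2 <= 0:
                continue  # point lies behind the novel camera
            # Project back to pixel coordinates (nearest pixel).
            u2 = round(fx * x2 / z2 + cx)
            v2 = round(fy * y2 / z2 + cy)
            if 0 <= u2 < w and 0 <= v2 < h:
                # Z-buffer: keep the closest surface at each pixel.
                if warped[v2][u2] is None or z2 < warped[v2][u2]:
                    warped[v2][u2] = z2
    return warped


def view_loss(pred, gt, fx, fy, cx, cy, R, t):
    """Mean absolute depth error measured in the novel view.

    Only pixels that are valid in both warped maps are compared.
    """
    wp = warp_depth(pred, fx, fy, cx, cy, R, t)
    wg = warp_depth(gt, fx, fy, cx, cy, R, t)
    diffs = [
        abs(a - b)
        for row_p, row_g in zip(wp, wg)
        for a, b in zip(row_p, row_g)
        if a is not None and b is not None
    ]
    return sum(diffs) / len(diffs) if diffs else 0.0


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
flat = [[2.0] * 3 for _ in range(3)]  # toy 3x3 depth map, 2 units deep
same_view = warp_depth(flat, 1.0, 1.0, 1.0, 1.0, IDENTITY, [0.0, 0.0, 0.0])
pushed_back = warp_depth(flat, 1.0, 1.0, 1.0, 1.0, IDENTITY, [0.0, 0.0, 1.0])
```

Under these assumptions, a pose generator in the adversarial setup would propose `R` and `t` that make `view_loss` large, while the depth network is trained to keep it small across those views.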
