CV, Jul 8, 2020

Camera Pose Matters: Improving Depth Prediction by Mitigating Pose Distribution Bias

arXiv:2007.03887v2, 6 citations
AI Analysis

The paper addresses a reliability issue in monocular depth prediction that matters for applications such as autonomous driving, but the contribution is incremental because it builds on existing depth prediction methods.

The paper tackles the poor performance of monocular depth predictors on images with uncommon camera poses, a consequence of camera pose bias in the training data. It shows that combining perspective-aware data augmentation with a model conditioned on camera pose improves depth prediction accuracy, and that encoding the camera pose distribution helps a synthetically trained predictor generalize to real images.

Monocular depth predictors are typically trained on large-scale training sets which are naturally biased w.r.t. the distribution of camera poses. As a result, trained predictors fail to make reliable depth predictions for testing examples captured under uncommon camera poses. To address this issue, we propose two novel techniques that exploit the camera pose during training and prediction. First, we introduce a simple perspective-aware data augmentation that synthesizes new training examples with more diverse views by perturbing the existing ones in a geometrically consistent manner. Second, we propose a conditional model that exploits the per-image camera pose as prior knowledge by encoding it as a part of the input. We show that jointly applying the two methods improves depth prediction on images captured under uncommon and even never-before-seen camera poses. We show that our methods improve performance when applied to a range of different predictor architectures. Lastly, we show that explicitly encoding the camera pose distribution improves the generalization performance of a synthetically trained depth predictor when evaluated on real images.
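
The idea of encoding the per-image camera pose "as a part of the input" can be illustrated with a minimal sketch. This is not the authors' encoding: it assumes, purely for illustration, that the pose is summarized by a pitch angle and a camera height, and that each value is broadcast into a constant image-sized channel appended to the RGB input. The function name `encode_pose_channels` and the normalization constants `max_pitch_rad` and `max_height_m` are hypothetical.

```python
import numpy as np


def encode_pose_channels(image, pitch_rad, height_m,
                         max_pitch_rad=np.pi / 2, max_height_m=3.0):
    """Append two constant channels holding normalized pitch and height.

    Assumption (not from the paper): pose = (pitch, height), and a naive
    constant-map encoding is enough to illustrate conditioning on pose.
    """
    h, w, _ = image.shape
    # One constant map per pose parameter, scaled to roughly [0, 1].
    pitch_map = np.full((h, w, 1), pitch_rad / max_pitch_rad, dtype=np.float32)
    height_map = np.full((h, w, 1), height_m / max_height_m, dtype=np.float32)
    # The depth predictor would then take 5 input channels instead of 3.
    return np.concatenate(
        [image.astype(np.float32), pitch_map, height_map], axis=-1
    )


image = np.zeros((4, 6, 3), dtype=np.float32)  # dummy RGB image
conditioned = encode_pose_channels(image, pitch_rad=np.pi / 4, height_m=1.5)
print(conditioned.shape)  # (4, 6, 5)
```

With this kind of input, the first convolution of any predictor architecture only needs its input channel count changed, which is one plausible reason a pose-conditioned input can be applied across different architectures.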

Code Implementations: 1 repo