CV | Oct 15, 2023

AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation

arXiv:2310.09739v3 | 16 citations | h-index: 19 | Has Code
Originality: Highly original
AI Analysis

This work addresses a bottleneck in training pipelines for depth estimation tasks, offering a practical solution to enhance model robustness and accuracy.

The paper tackles the problem of limited data augmentation in unsupervised depth completion and estimation by introducing a method to reverse geometric transformations, enabling the use of previously-infeasible augmentations. This approach consistently improves performance on indoor (VOID) and outdoor (KITTI) datasets, with demonstrated generalization to four other datasets.

Unsupervised depth completion and estimation methods are trained by minimizing reconstruction error. Block artifacts from resampling, intensity saturation, and occlusions are among the many undesirable by-products of common data augmentation schemes that affect image reconstruction quality, and thus the training signal. Hence, typical image augmentations viewed as essential to training pipelines in other vision tasks have seen limited use beyond small image intensity changes and flipping. The sparse depth modality in depth completion has seen even less use, as intensity transformations alter the scale of the 3D scene, and geometric transformations may decimate the sparse points during resampling. We propose a method that unlocks a wide range of previously infeasible geometric augmentations for unsupervised depth completion and estimation. This is achieved by reversing, or "undo"-ing, geometric transformations to the coordinates of the output depth, warping the depth map back to the original reference frame. This enables computing the reconstruction losses using the original images and sparse depth maps, eliminating the pitfalls of naive loss computation on the augmented inputs and allowing us to scale up augmentations to boost performance. We demonstrate our method on indoor (VOID) and outdoor (KITTI) datasets, where we consistently improve upon recent methods on both datasets as well as in generalization to four other datasets. Code available at: https://github.com/alexklwong/augundo.
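
The "undo" step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that). It assumes a single affine augmentation expressed as a 3x3 matrix in pixel coordinates and uses nearest-neighbour resampling. The function names warp_affine_nearest and undo_augmentation are hypothetical and chosen only for this illustration.

```python
import numpy as np


def warp_affine_nearest(image, matrix, fill=0.0):
    """Warp a 2D array with a 3x3 affine matrix (assumed convention:
    the matrix maps source pixel (x, y, 1) to output pixel coordinates).

    Each output pixel looks up its source pixel through the inverse
    matrix; pixels whose source falls outside the frame get `fill`.
    """
    height, width = image.shape
    ys, xs = np.mgrid[0:height, 0:width]
    out_coords = np.stack(
        [xs.ravel(), ys.ravel(), np.ones(height * width)]
    ).astype(float)

    # Inverse mapping: where in the source does each output pixel come from?
    src = np.linalg.inv(matrix) @ out_coords
    src_x = np.rint(src[0]).astype(int)
    src_y = np.rint(src[1]).astype(int)

    valid = (src_x >= 0) & (src_x < width) & (src_y >= 0) & (src_y < height)
    out = np.full(height * width, fill, dtype=float)
    out[valid] = image[src_y[valid], src_x[valid]]
    return out.reshape(height, width)


def undo_augmentation(depth_aug, matrix):
    """Warp a depth map predicted on augmented input back to the
    original reference frame by applying the inverse transformation."""
    return warp_affine_nearest(depth_aug, np.linalg.inv(matrix))
```

In a training loop built along these lines, the network would see the augmented image and sparse depth, its output depth would pass through undo_augmentation, and the reconstruction loss would be computed against the original, unaugmented inputs, restricted to pixels that remain inside the frame after the round trip.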

