CV · Jan 8, 2024

NeRFmentation: NeRF-based Augmentation for Monocular Depth Estimation

arXiv:2401.03771v2 · 4 citations · h-index: 28
Originality: Synthesis-oriented
AI Analysis

This paper addresses the scarcity and limited diversity of training data for depth estimation models in autonomous driving, but it is an incremental improvement that applies existing methods to new data.

The paper tackles the problem of limited, linearly captured training data for monocular depth estimation (MDE) in autonomous driving. It proposes NeRFmentation, a NeRF-based data augmentation pipeline that generates synthetic RGB-D images from new viewing directions. Applied to three state-of-the-art MDE architectures on KITTI, it improved performance on the original test set, a separate driving dataset, and a synthetic test set.

The capabilities of monocular depth estimation (MDE) models are limited by the availability of sufficient and diverse datasets. In the case of MDE models for autonomous driving, this issue is exacerbated by the linearity of the captured data trajectories. We propose a NeRF-based data augmentation pipeline to introduce synthetic data with more diverse viewing directions into training datasets and demonstrate the benefits of our approach to model performance and robustness. Our data augmentation pipeline, which we call NeRFmentation, trains NeRFs on each scene in a dataset, filters out subpar NeRFs based on relevant metrics, and uses them to generate synthetic RGB-D images captured from new viewing directions. In this work, we apply our technique in conjunction with three state-of-the-art MDE architectures on the popular autonomous driving dataset, KITTI, augmenting its training set of the Eigen split. We evaluate the resulting performance gain on the original test set, a separate popular driving dataset, and our own synthetic test set.
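
The filter-then-render steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract only mentions "relevant metrics" and "new viewing directions", so the PSNR threshold, the yaw angles, and every name here (`SceneNeRF`, `filter_nerfs`, `novel_poses`, `plan_augmentation`) are hypothetical. The actual NeRF training and RGB-D rendering are left out; the sketch only plans which camera poses would be rendered.

```python
import math
from dataclasses import dataclass


@dataclass
class SceneNeRF:
    """Hypothetical record for one trained per-scene NeRF."""
    scene_id: str
    psnr: float  # assumed quality metric measured on held-out views


def filter_nerfs(nerfs, min_psnr=20.0):
    """Keep only NeRFs whose PSNR meets the (assumed) quality threshold."""
    return [n for n in nerfs if n.psnr >= min_psnr]


def yaw_matrix(angle_deg):
    """4x4 homogeneous rotation about the y axis by angle_deg degrees."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [
        [c, 0.0, s, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [-s, 0.0, c, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul4(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def novel_poses(pose, yaw_angles_deg):
    """Rotate a camera-to-world pose about the camera's own y axis.

    Right-multiplying keeps the camera position and changes only the
    viewing direction.
    """
    return [matmul4(pose, yaw_matrix(a)) for a in yaw_angles_deg]


def plan_augmentation(nerfs, train_poses, yaw_angles_deg=(-10.0, 10.0),
                      min_psnr=20.0):
    """Return (scene_id, pose) pairs at which synthetic frames would be rendered."""
    plan = []
    for nerf in filter_nerfs(nerfs, min_psnr):
        for pose in train_poses[nerf.scene_id]:
            for new_pose in novel_poses(pose, yaw_angles_deg):
                plan.append((nerf.scene_id, new_pose))
    return plan
```

In this sketch, each pair returned by `plan_augmentation` would be handed to the corresponding scene's NeRF to render one RGB image and one depth map, which would then be added to the training set alongside the original frames.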

