AutoDispNet: Improving Disparity Estimation With AutoML
This work addresses the challenge of efficiently optimizing complex architectures for computer vision tasks, focusing on disparity estimation.
The paper uses AutoML to optimize large-scale U-Net-like encoder-decoder architectures for disparity estimation. Its results clearly outperform the manually optimized baseline and reach state-of-the-art performance.
Much research effort in computer vision is spent on optimizing existing network architectures to obtain a few more percentage points on benchmarks. Recent AutoML approaches promise to relieve us of this effort. However, they are mainly designed for comparatively small-scale classification tasks. In this work, we show how to use and extend existing AutoML techniques to efficiently optimize large-scale U-Net-like encoder-decoder architectures. In particular, we leverage gradient-based neural architecture search and Bayesian optimization for hyperparameter search. The resulting optimization does not require a large-scale compute cluster. We show results on disparity estimation that clearly outperform the manually optimized baseline and reach state-of-the-art performance.
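The abstract does not spell out how the gradient-based architecture search works. One widely used formulation is a continuous relaxation, in which each edge of the network computes a softmax-weighted mixture of candidate operations and the mixture weights are learned by gradient descent. The sketch below illustrates that idea only, assuming this formulation. The candidate operations, the names `CANDIDATE_OPS`, `mixed_op`, and `discretize`, and the use of plain Python lists in place of feature maps are all hypothetical and are not taken from the paper.

```python
import math

def softmax(alpha):
    """Numerically stable softmax over a list of architecture weights."""
    m = max(alpha)
    exps = [math.exp(a - m) for a in alpha]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate operations acting on a list of floats.
# A real search space would hold convolutions, pooling, skip connections, etc.
CANDIDATE_OPS = {
    "identity": lambda x: list(x),
    "zero": lambda x: [0.0 for _ in x],
    "double": lambda x: [2.0 * v for v in x],
}

def mixed_op(x, alpha):
    """Continuous relaxation: softmax-weighted sum of all candidate outputs."""
    weights = softmax(alpha)
    outputs = [op(x) for op in CANDIDATE_OPS.values()]
    return [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(x))
    ]

def discretize(alpha):
    """After search, keep the operation with the largest architecture weight."""
    names = list(CANDIDATE_OPS)
    best = max(range(len(alpha)), key=lambda i: alpha[i])
    return names[best]
```

Because `mixed_op` is differentiable with respect to `alpha`, the architecture weights can be trained with the same gradient-based optimizer as the network weights. `discretize` then selects a single operation per edge once the search ends. Calling `discretize([0.1, -1.0, 2.0])` selects `"double"` in this toy setup.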