A benchmark with decomposed distribution shifts for 360 monocular depth estimation
This benchmark gives computer vision researchers working on monocular depth estimation a more nuanced evaluation framework, though the contribution is incremental, as it builds on existing distribution shift analysis.
The authors address the problem of evaluating monocular depth estimation models under realistic distribution shifts by creating a benchmark that decomposes the uncontrolled in-the-wild shift into three distinct types: covariate, prior, and concept shift. They show that each type poses its own challenge, and that combining shifts causes larger performance drops that standard methods cannot address uniformly.
In this work, we contribute a distribution shift benchmark for a computer vision task: monocular depth estimation. Our differentiation is the decomposition of the wider distribution shift of uncontrolled testing on in-the-wild data into three distinct distribution shifts. Specifically, we generate data via synthesis and analyze it to produce covariate (color input), prior (depth output), and concept (their relationship) distribution shifts. We also synthesize combinations and show that each one is indeed a different challenge to address, as stacking them produces increased performance drops that cannot be addressed horizontally using standard approaches.
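The three shift types named above can be illustrated on a toy discrete problem. The sketch below is not the benchmark's procedure. It assumes the common textbook reading in which the input marginal p(x) stands in for the color input, the output marginal p(y) for the depth output, and the conditional p(y | x) for their relationship. The helper names `make_joint` and `describe_shift` are hypothetical and introduced only for this illustration.

```python
import numpy as np

def make_joint(p_x, p_y_given_x):
    """Build joint[i, j] = p(x=i) * p(y=j | x=i) for a discrete toy problem."""
    p_x = np.asarray(p_x, dtype=float)
    p_y_given_x = np.asarray(p_y_given_x, dtype=float)
    return p_x[:, None] * p_y_given_x

def describe_shift(train_joint, test_joint):
    """Report which parts of the joint differ between two toy splits.

    Assumes every input value has non-zero probability in both splits,
    so the conditional p(y | x) is well defined.
    """
    def parts(joint):
        joint = np.asarray(joint, dtype=float)
        p_x = joint.sum(axis=1)        # input marginal (covariate side)
        p_y = joint.sum(axis=0)        # output marginal (prior side)
        cond = joint / p_x[:, None]    # conditional p(y | x) (concept side)
        return p_x, p_y, cond

    px_a, py_a, cond_a = parts(train_joint)
    px_b, py_b, cond_b = parts(test_joint)
    return {
        "input_marginal_changed": not np.allclose(px_a, px_b),
        "output_marginal_changed": not np.allclose(py_a, py_b),
        "conditional_changed": not np.allclose(cond_a, cond_b),
    }

# Toy training split: two input values, two output values.
relation = [[0.9, 0.1], [0.2, 0.8]]
train = make_joint([0.5, 0.5], relation)

# Same input-output relationship, different input distribution.
covariate_test = make_joint([0.8, 0.2], relation)

# Same input distribution, different input-output relationship.
concept_test = make_joint([0.5, 0.5], [[0.6, 0.4], [0.2, 0.8]])

print(describe_shift(train, covariate_test))
print(describe_shift(train, concept_test))
```

For real image and depth data these distributions are not available in closed form, so a check like this only conveys the definitions and is not a way to measure shift on the benchmark itself.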