CAM-Convs: Camera-Aware Multi-Scale Convolutions for Single-View Depth
The work addresses a practical issue in computer vision applications: switching to a different camera normally requires collecting a costly new training dataset. Its solution is specific to single-view depth estimation.
The paper tackles the failure of single-view depth estimation networks to generalize across camera models. It proposes camera-aware convolutions that take the camera parameters into account, so that the network can learn calibration-aware patterns. This improves generalization and outperforms state-of-the-art methods when the train and test images come from different cameras.
Single-view depth estimation suffers from the problem that a network trained on images from one camera does not generalize to images taken with a different camera model. Thus, changing the camera model requires collecting an entirely new training dataset. In this work, we propose a new type of convolution that can take the camera parameters into account, thus allowing neural networks to learn calibration-aware patterns. Experiments confirm that this improves the generalization capabilities of depth prediction networks considerably, and clearly outperforms the state of the art when the train and test images are acquired with different cameras.
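The abstract does not spell out how the camera parameters reach the convolution. As a minimal sketch of one plausible approach, assume a pinhole camera and compute per-pixel maps from the intrinsics that could be concatenated to a convolution's input as extra channels. The helper name `camera_aware_channels` and the parameters `fx`, `fy`, `cx`, `cy` (focal lengths and principal point, in pixels) are assumptions made for illustration, not names taken from the paper.

```python
import math


def camera_aware_channels(width, height, fx, fy, cx, cy):
    """Build per-pixel maps from pinhole intrinsics (hypothetical helper).

    Returns four height x width maps as nested lists:
    centered x and y coordinates in pixels, and the horizontal and
    vertical viewing angles in radians.
    """
    # Pixel coordinates measured relative to the principal point.
    ccx = [[u - cx for u in range(width)] for _ in range(height)]
    ccy = [[v - cy for _ in range(width)] for v in range(height)]
    # Viewing angle of each pixel; this depends on the focal length,
    # so two cameras yield different maps for the same pixel grid.
    fov_x = [[math.atan(x / fx) for x in row] for row in ccx]
    fov_y = [[math.atan(y / fy) for y in row] for row in ccy]
    return ccx, ccy, fov_x, fov_y


# Same image size, two assumed cameras with different focal lengths.
wide = camera_aware_channels(5, 3, fx=2.0, fy=2.0, cx=2.0, cy=1.0)
narrow = camera_aware_channels(5, 3, fx=4.0, fy=4.0, cx=2.0, cy=1.0)
# The right-most pixel sees a larger angle with the shorter focal length.
print(wide[2][0][4] > narrow[2][0][4])
```

Because the angle maps change with the focal length while the image grid stays the same, a network that receives them alongside the image has, in principle, the information needed to tell cameras apart rather than silently assuming the training camera.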