CV · Sep 28, 2023

Improving Equivariance in State-of-the-Art Supervised Depth and Normal Predictors

Yuanyi Zhong, Anand Bhattad, Yu-Xiong Wang, David Forsyth

arXiv:2309.16646v25.93 citations · h-index: 16 · Has Code

Originality: Incremental advance

AI Analysis

This paper addresses a fundamental property that depth and normal prediction models in computer vision should satisfy, and offers incremental improvements to existing methods.

The paper shows that state-of-the-art depth and normal predictors are not equivariant to cropping-and-resizing, even when trained with crop-and-resize data augmentation. It proposes an equivariant regularization technique that improves supervised and semi-supervised learning performance on Taskonomy tasks and accuracy on NYU-v2.

Dense depth and surface normal predictors should possess the equivariant property to cropping-and-resizing -- cropping the input image should result in cropping the same output image. However, we find that state-of-the-art depth and normal predictors, despite having strong performances, surprisingly do not respect equivariance. The problem exists even when crop-and-resize data augmentation is employed during training. To remedy this, we propose an equivariant regularization technique, consisting of an averaging procedure and a self-consistency loss, to explicitly promote cropping-and-resizing equivariance in depth and normal networks. Our approach can be applied to both CNN and Transformer architectures, does not incur extra cost during testing, and notably improves the supervised and semi-supervised learning performance of dense predictors on Taskonomy tasks. Finally, finetuning with our loss on unlabeled images improves not only equivariance but also accuracy of state-of-the-art depth and normal predictors when evaluated on NYU-v2. GitHub link: https://github.com/mikuhatsune/equivariance
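The equivariance property described in the abstract can be illustrated with a minimal NumPy sketch: predicting on a cropped-and-resized image should match cropping-and-resizing the prediction on the full image. The helper names `crop_and_resize` and `equivariance_gap`, the nearest-neighbor resize, and the mean-absolute-difference measure are all assumptions chosen for illustration; this is not the authors' implementation, which is in the linked repository.

```python
import numpy as np

def crop_and_resize(img, box, out_hw):
    """Crop a 2D array to box=(top, left, height, width), then resize
    to out_hw with nearest-neighbor sampling (an illustrative choice)."""
    top, left, h, w = box
    crop = img[top:top + h, left:left + w]
    out_h, out_w = out_hw
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return crop[rows][:, cols]

def equivariance_gap(predict, img, box):
    """Mean absolute difference between predict(crop(img)) and crop(predict(img)).
    A value of 0 means the predictor is equivariant for this image and crop."""
    out_hw = img.shape[:2]
    pred_of_crop = predict(crop_and_resize(img, box, out_hw))
    crop_of_pred = crop_and_resize(predict(img), box, out_hw)
    return float(np.mean(np.abs(pred_of_crop - crop_of_pred)))

# Toy stand-ins for a dense predictor (hypothetical, not real depth networks).
pointwise = lambda x: 2.0 * x + 1.0        # per-pixel map: equivariant
mean_centered = lambda x: x - x.mean()     # depends on global context: not equivariant

img = np.arange(64, dtype=np.float64).reshape(8, 8)
box = (0, 0, 4, 4)                         # top-left quadrant

gap_pointwise = equivariance_gap(pointwise, img, box)
gap_centered = equivariance_gap(mean_centered, img, box)
print(gap_pointwise, gap_centered)
```

A quantity like `equivariance_gap` could in principle serve as a self-consistency penalty on unlabeled images, which is the general idea the abstract describes; the paper's actual averaging procedure and loss may differ from this sketch.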
