CV · AI · Nov 17, 2020

Learning Canonical Transformations

arXiv:2011.08822v1 · 1 citation
AI Analysis

This work addresses how neural networks can learn generalizable geometric transformations, a foundational problem for improving generalization in computer vision models.

This paper explores inductive biases that help neural networks learn canonical geometric transformations, such as translation and rotation, in pixel space. The authors find that high training set diversity is sufficient for translation to extrapolate to unseen shapes and scales, and that an iterative training scheme achieves significant extrapolation of rotation in time.

Humans understand a set of canonical geometric transformations (such as translation and rotation) that support generalization by being untethered to any specific object. We explore inductive biases that help a neural network model learn these transformations in pixel space in a way that can generalize out-of-domain. Specifically, we find that high training set diversity is sufficient for the extrapolation of translation to unseen shapes and scales, and that an iterative training scheme achieves significant extrapolation of rotation in time.
