CV | Aug 31, 2019

Towards Learning Affine-Invariant Representations via Data-Efficient CNNs

arXiv:1909.00114v1 | 24 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of limited labeled data for robust object recognition in computer vision. It is an incremental advance that integrates prior knowledge into CNNs.

The paper tackles learning affine-invariant object representations to improve the data-efficiency and robustness of CNNs. Trained with only 10 images per class, it reaches 84.15% test accuracy on the Traffic Sign dataset, outperforming the state of the art by 29.80%.

In this paper, we propose integrating a priori knowledge into both the design and training of convolutional neural networks (CNNs) to learn object representations that are invariant to affine transformations (i.e., translation, scale, rotation). Accordingly, we propose a novel multi-scale maxout CNN and train it end-to-end with a novel rotation-invariant regularizer. This regularizer aims to enforce the weights in each 2D spatial filter to approximate circular patterns. In this way, we manage to handle affine transformations in training using convolution, multi-scale maxout, and circular filters. Empirically, we demonstrate that such knowledge can significantly improve the data-efficiency as well as the generalization and robustness of learned models. For instance, on the Traffic Sign data set, trained with only 10 images per class, our method achieves 84.15% test accuracy, outperforming the state-of-the-art by 29.80%.
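To make the idea of a regularizer that pushes each 2D filter toward a circular pattern more concrete, here is a minimal NumPy sketch. It is not the paper's formulation, which the abstract does not specify. It assumes one plausible reading: weights at the same distance from the filter center should be equal, so the penalty is the squared deviation of each weight from the mean of its ring. The function names `radial_groups` and `circular_penalty` are hypothetical.

```python
import numpy as np


def radial_groups(k):
    """Label each cell of a k x k filter by its squared distance from the center.

    Assumption for illustration: cells at equal distance form one "ring".
    """
    c = (k - 1) / 2.0
    ys, xs = np.mgrid[0:k, 0:k]
    d2 = (ys - c) ** 2 + (xs - c) ** 2
    # Map every distinct squared distance to an integer ring label.
    _, labels = np.unique(np.round(d2.ravel(), 6), return_inverse=True)
    return labels.reshape(k, k)


def circular_penalty(w):
    """Sum of squared deviations of each weight from the mean of its ring.

    The penalty is zero exactly when the filter is constant on every ring.
    """
    w = np.asarray(w, dtype=float)
    labels = radial_groups(w.shape[0]).ravel()
    flat = w.ravel()
    ring_mean = np.bincount(labels, weights=flat) / np.bincount(labels)
    return float(np.sum((flat - ring_mean[labels]) ** 2))


# A filter that is constant on each ring incurs no penalty.
symmetric = np.array([[2.0, 1.0, 2.0],
                      [1.0, 0.0, 1.0],
                      [2.0, 1.0, 2.0]])
# A ramp filter varies within each ring, so it is penalized.
ramp = np.arange(9.0).reshape(3, 3)

print(circular_penalty(symmetric))  # 0.0
print(circular_penalty(ramp))       # 60.0
```

In training, a term like this would typically be summed over all spatial filters and added to the task loss with a weighting coefficient. That coefficient and the exact ring definition are assumptions here, not details taken from the paper.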

