CV, Aug 22, 2017

Representation Learning by Learning to Count

arXiv:1708.06734v1, 379 citations
Originality: Highly original
AI Analysis

This method addresses the problem of learning visual representations without labeled data. The approach itself is novel, while its evaluation is incremental in that it relies on existing benchmarks.

The paper tackles representation learning with an artificial supervision signal based on counting visual primitives. The signal requires no manual annotation, and the resulting representations perform on par with or exceed the state of the art on transfer learning benchmarks.

We introduce a novel method for representation learning that uses an artificial supervision signal based on counting visual primitives. This supervision signal is obtained from an equivariance relation, which does not require any manual annotation. We relate transformations of images to transformations of the representations. More specifically, we look for the representation that satisfies such relation rather than the transformations that match a given representation. In this paper, we use two image transformations in the context of counting: scaling and tiling. The first transformation exploits the fact that the number of visual primitives should be invariant to scale. The second transformation allows us to equate the total number of visual primitives in each tile to that in the whole image. These two transformations are combined in one constraint and used to train a neural network with a contrastive loss. The proposed task produces representations that perform on par or exceed the state of the art in transfer learning benchmarks.
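The constraint described in the abstract can be illustrated with a small sketch. It assumes a 2x2 tiling, a scaling step that halves the image by averaging 2x2 blocks, and a hinge-style contrastive term with a margin; none of these specifics are stated in the abstract, and the names `downsample`, `tiles`, `counting_loss`, and `margin` are hypothetical. The feature function `phi` stands in for the neural network. The idea is that the features of the downsampled image should equal the sum of the features of its tiles, while the downsampled version of a different image `y` should not match that sum.

```python
def downsample(img):
    """Halve height and width by averaging each 2x2 block (assumed scaling step)."""
    h, w = len(img), len(img[0])
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]


def tiles(img):
    """Split an image with even side lengths into a 2x2 grid of tiles (assumed tiling)."""
    hh, hw = len(img) // 2, len(img[0]) // 2
    top, bottom = img[:hh], img[hh:]
    return [[row[:hw] for row in top], [row[hw:] for row in top],
            [row[:hw] for row in bottom], [row[hw:] for row in bottom]]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def counting_loss(phi, x, y, margin=10.0):
    """Counting constraint on x plus a hinge term that pushes y away (sketch only)."""
    tile_feats = [phi(t) for t in tiles(x)]
    # Element-wise sum of the tile feature vectors: the "count" over all tiles.
    tile_sum = [sum(vals) for vals in zip(*tile_feats)]
    same = sq_dist(phi(downsample(x)), tile_sum)    # should be small
    other = sq_dist(phi(downsample(y)), tile_sum)   # should exceed the margin
    return same + max(0.0, margin - other)


# Toy check with hand-written feature functions (not a trained network).
x = [[1.0] * 4 for _ in range(4)]
y = [[0.0] * 4 for _ in range(4)]
phi_sum = lambda img: [sum(map(sum, img))]   # pixel sum, violates the constraint
phi_zero = lambda img: [0.0, 0.0]            # trivial all-zero representation
print(counting_loss(phi_sum, x, y))   # 144.0
print(counting_loss(phi_zero, x, y))  # 10.0
```

The second call shows why a contrastive term is useful in this sketch: an all-zero representation satisfies the counting constraint exactly, yet it still pays the full margin because it cannot tell `x` and `y` apart.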

Code Implementations: 2 repos
