CV · LG · ML · Jun 18, 2012

On multi-view feature learning

arXiv:1206.4609v1 · 58 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses feature learning for object recognition in multi-view settings. Its theoretical insights build incrementally on existing sparse coding approaches.

The paper tackles the problem of learning features from multi-view data, such as spatio-temporal or binocular observations. It analyzes how hidden variables encode transformations by detecting rotation angles in the eigenspaces shared among multiple image warps. The analysis helps explain experimental results in which transformation-specific features emerge when complex cell models are trained on videos, and it shows that transformation-invariant features can arise as a by-product of learning to represent transformations.

Sparse coding is a common approach to learning local features for object recognition. Recently, there has been an increasing interest in learning features from spatio-temporal, binocular, or other multi-observation data, where the goal is to encode the relationship between images rather than the content of a single image. We provide an analysis of multi-view feature learning, which shows that hidden variables encode transformations by detecting rotation angles in the eigenspaces shared among multiple image warps. Our analysis helps explain recent experimental results showing that transformation-specific features emerge when training complex cell models on videos. Our analysis also shows that transformation-invariant features can emerge as a by-product of learning representations of transformations.
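The idea of "detecting rotation angles in shared eigenspaces" can be illustrated with a toy case. The sketch below is not the paper's model. It assumes the warps are cyclic shifts of a 1-D signal, which all commute and are therefore diagonalized by the same basis (the discrete Fourier basis). In that basis a shift acts on each frequency component as a phase rotation, so comparing the two views component by component reveals the transformation. The function names `rotation_angles` and `estimate_shift` are hypothetical and chosen only for this example.

```python
import numpy as np

def rotation_angles(x, y):
    """Per-frequency phase rotation that maps view x onto view y.

    Assumption: y is a cyclic shift of x, so both views are related by a
    rotation inside each eigenspace of the shift operator (the DFT basis).
    """
    X = np.fft.fft(x)
    Y = np.fft.fft(y)
    # The angle of Y[k] * conj(X[k]) is the rotation applied to component k.
    return np.angle(Y * np.conj(X))

def estimate_shift(x, y):
    """Recover the integer cyclic shift from the rotation at frequency 1."""
    n = len(x)
    angle = rotation_angles(x, y)[1]
    # For y = np.roll(x, s), component k is rotated by -2*pi*k*s/n.
    return int(np.rint(-angle * n / (2 * np.pi))) % n

rng = np.random.default_rng(0)
x = rng.standard_normal(16)   # first view
y = np.roll(x, 3)             # second view: x shifted by 3 samples
print(estimate_shift(x, y))
```

The rotation angle depends on the shift but not on the content of `x`, which is the sense in which such a detector encodes the relationship between two views rather than either view alone.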

