CV LG NE · Nov 21, 2014

Understanding image representations by measuring their equivariance and equivalence

arXiv:1411.5908v2 · 620 citations

Originality: Incremental advance

AI Analysis

This work provides tools for analyzing and comparing image representations. The advance is incremental, but it matters both for theoretical understanding and for practical applications such as structured-output regression.

The paper addresses the limited theoretical understanding of image representations by investigating their equivariance, invariance, and equivalence properties. It proposes empirical methods to measure these properties and applies them to popular representations such as CNNs, revealing structural insights such as the layers at which certain geometric invariances are achieved.

Despite the importance of image representations such as histograms of oriented gradients and deep Convolutional Neural Networks (CNN), our theoretical understanding of them remains limited. Aiming at filling this gap, we investigate three key mathematical properties of representations: equivariance, invariance, and equivalence. Equivariance studies how transformations of the input image are encoded by the representation, invariance being a special case where a transformation has no effect. Equivalence studies whether two representations, for example two different parametrisations of a CNN, capture the same visual information or not. A number of methods to establish these properties empirically are proposed, including introducing transformation and stitching layers in CNNs. These methods are then applied to popular representations to reveal insightful aspects of their structure, including clarifying at which layers in a CNN certain geometric invariances are achieved. While the focus of the paper is theoretical, direct applications to structured-output regression are demonstrated too.
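
To make the equivariance and invariance definitions concrete, here is a minimal sketch in Python with NumPy. It is not the authors' method or code: the representation `phi` (rectified horizontal differences), the transformation `flip` (horizontal mirroring), and the helpers `fit_linear_map` and `relative_error` are all hypothetical and chosen only for illustration. The sketch fits a linear map `M_g` by least squares so that `phi(x) @ M_g` approximates `phi(flip(x))`, then measures the fit on held-out images. A small error suggests the representation is equivariant to the flip under a linear map. Invariance is the special case where the map is the identity, so it is checked by comparing `phi(x)` with `phi(flip(x))` directly.

```python
import numpy as np

def phi(images):
    """Toy representation (an assumption, not from the paper):
    rectified horizontal differences in two polarity channels."""
    d = images[:, :, 1:] - images[:, :, :-1]                 # shape (N, H, W-1)
    feats = np.stack([np.maximum(d, 0), np.maximum(-d, 0)], axis=1)
    return feats.reshape(len(images), -1)                    # shape (N, 2*H*(W-1))

def flip(images):
    """Transformation g: mirror each image left to right."""
    return images[:, :, ::-1]

def fit_linear_map(src, dst):
    """Least-squares solution M of src @ M ≈ dst."""
    M, *_ = np.linalg.lstsq(src, dst, rcond=None)
    return M

def relative_error(pred, target):
    """Frobenius-norm error of pred relative to target."""
    return np.linalg.norm(pred - target) / np.linalg.norm(target)

rng = np.random.default_rng(0)
train = rng.standard_normal((500, 4, 4))   # random 4x4 "images"
test = rng.standard_normal((100, 4, 4))

# Equivariance: learn M_g on training images, evaluate on held-out images.
M_g = fit_linear_map(phi(train), phi(flip(train)))
equivariance_error = relative_error(phi(test) @ M_g, phi(flip(test)))

# Invariance: the special case where M_g is the identity.
invariance_error = relative_error(phi(test), phi(flip(test)))

print(f"equivariance error: {equivariance_error:.2e}")
print(f"invariance error:   {invariance_error:.2e}")
```

For this toy representation, flipping the image only swaps the two polarity channels and mirrors the feature positions, so a linear map fits almost exactly while the identity map does not. The representation is therefore equivariant to flips but not invariant to them. For a real representation such as a CNN layer, the fit would generally be approximate, and some form of regularisation would likely be needed.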
