Geodesics of learned representations
This work addresses how to understand and improve the invariance properties of neural networks for image processing. It offers a tool for visualizing and refining those invariances, building incrementally on existing methods for analyzing representations.
The authors develop a method that tests whether a learned representation linearizes a transformation. Applied to a state-of-the-art image classification network, the test shows that the network fails to linearize geometric transformations such as translation, rotation, and dilation. The authors then propose a remedy and demonstrate that the modified representation linearizes a variety of geometric image transformations.
We develop a new method for visualizing and refining the invariances of learned representations. Specifically, we test for a general form of invariance, linearization, in which the action of a transformation is confined to a low-dimensional subspace. Given two reference images (typically, differing by some transformation), we synthesize a sequence of images lying on a path between them that is of minimal length in the space of the representation (a "representational geodesic"). If the transformation relating the two reference images is linearized by the representation, this sequence should follow the gradual evolution of this transformation. We use this method to assess the invariance properties of a state-of-the-art image classification network and find that geodesics generated for image pairs differing by translation, rotation, and dilation do not evolve according to their associated transformations. Our method also suggests a remedy for these failures, and following this prescription, we show that the modified representation is able to linearize a variety of geometric image transformations.
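To make the idea of a representational geodesic concrete, the following is a minimal sketch under stated assumptions, not the procedure used in this work. It replaces the network with a hypothetical two-dimensional toy map `represent`, discretizes the path into a few points, and minimizes the discrete path energy (the sum of squared representation-space distances between consecutive points, a common surrogate for path length) by plain gradient descent with finite-difference gradients. The names `represent`, `path_energy`, and `geodesic`, and all parameter values, are illustrative choices.

```python
# Toy sketch of a representational geodesic. All names and values are
# hypothetical; the toy map below stands in for a learned representation.

def represent(x):
    """Toy invertible 'representation' f: R^2 -> R^2."""
    return (x[0], x[1] + x[0] ** 2)


def path_energy(path):
    """Sum of squared representation-space distances between neighbours."""
    reps = [represent(p) for p in path]
    return sum(
        (b[0] - a[0]) ** 2 + (b[1] - a[1]) ** 2
        for a, b in zip(reps, reps[1:])
    )


def geodesic(start, end, n_segments=4, lr=0.03, steps=3000, h=1e-5):
    """Return a discrete path from start to end with low representation energy."""
    # Initialise with straight-line interpolation in the input space.
    path = [
        [start[k] + (end[k] - start[k]) * i / n_segments for k in range(2)]
        for i in range(n_segments + 1)
    ]
    for _ in range(steps):
        grads = []
        for i in range(1, n_segments):  # the two endpoints stay fixed
            g = []
            for k in range(2):
                orig = path[i][k]
                # Central finite difference of the energy w.r.t. one coordinate.
                path[i][k] = orig + h
                e_plus = path_energy(path)
                path[i][k] = orig - h
                e_minus = path_energy(path)
                path[i][k] = orig
                g.append((e_plus - e_minus) / (2 * h))
            grads.append(g)
        # Gradient-descent update on the interior points only.
        for i, g in zip(range(1, n_segments), grads):
            for k in range(2):
                path[i][k] -= lr * g[k]
    return path


path = geodesic((-1.0, 0.0), (1.0, 0.0))
```

In this toy setting the straight line between the two endpoints in input space is not the geodesic. The optimized interior points should instead bend toward the curve x1 = 1 - x0^2, whose image under `represent` is a straight, evenly spaced line in representation space. For an image network, the same comparison would be made between the synthesized sequence and the gradual evolution of the transformation relating the two reference images.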