Inverting Visual Representations with Convolutional Networks
This work addresses the interpretability of visual representations and is aimed at computer vision researchers. The contribution is incremental: it builds on existing inversion methods and adds a new neural-network-based approach.
The paper tackles the problem of analyzing and interpreting visual feature representations by proposing an up-convolutional neural network that inverts them. For shallow representations such as HOG and SIFT, its reconstructions are significantly better than those of existing methods, which reveals that these features contain rich information.
Feature representations, both hand-designed and learned ones, are often hard to analyze and interpret, even when they are extracted from visual data. We propose a new approach to study image representations by inverting them with an up-convolutional neural network. We apply the method to shallow representations (HOG, SIFT, LBP), as well as to deep networks. For shallow representations our approach provides significantly better reconstructions than existing methods, revealing that there is surprisingly rich information contained in these features. Inverting a deep network trained on ImageNet provides several insights into the properties of the feature representation learned by the network. Most strikingly, the colors and the rough contours of an image can be reconstructed from activations in higher network layers and even from the predicted class probabilities.
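To make the term "up-convolutional" concrete, the following is a minimal pure-Python sketch of one such layer, not the authors' implementation. It assumes that an up-convolution is a 2x unpooling step (each value placed in the top-left corner of a 2x2 block of zeros) followed by an ordinary convolution. The function names `unpool2x`, `conv2d_same`, and `upconv` are hypothetical, and the kernel is fixed by hand here, whereas a real inversion network would learn its kernels from data.

```python
def unpool2x(fmap):
    """Upsample a 2D map by 2: each value goes to the top-left corner
    of a 2x2 block, the other three entries are zero (assumed scheme)."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            out[2 * i][2 * j] = float(fmap[i][j])
    return out


def conv2d_same(fmap, kernel):
    """Cross-correlate a 2D map with an odd-sized kernel, using zero
    padding so that the output has the same size as the input."""
    h, w = len(fmap), len(fmap[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= x < w:
                        acc += fmap[y][x] * kernel[a][b]
            out[i][j] = acc
    return out


def upconv(fmap, kernel):
    """One up-convolutional layer: unpooling followed by convolution."""
    return conv2d_same(unpool2x(fmap), kernel)


# Hand-picked kernel (an assumption for illustration): it copies each
# unpooled value into the three zero entries of its 2x2 block.
FILL_KERNEL = [[1, 1, 0],
               [1, 1, 0],
               [0, 0, 0]]

if __name__ == "__main__":
    features = [[1, 2],
                [3, 4]]
    for row in upconv(features, FILL_KERNEL):
        print(row)
```

With this particular kernel the layer reduces to nearest-neighbour upsampling, which makes the result easy to check by hand. Stacking several such layers with learned kernels is how a network of this kind could grow a low-resolution feature map into a full-resolution image.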