CV, LG | Nov 11, 2022

Depth and Representation in Vision Models

arXiv:2211.06496v3 | 3 citations | h-index: 4
Originality: Incremental advance
AI Analysis

This work offers insight into representation learning in vision models, supporting the view that classification inherently involves input generation. The advance is incremental but clarifies fundamental model behavior.

The study examines how deep learning models represent input images across layers. Before training, deeper layers reconstruct the original input less accurately because distinct inputs can map to approximately the same embedding. Training for classification increases representation clarity only in early layers, while late layers form abstract representations.

Deep learning models develop successive representations of their input in sequential layers, the last of which maps the final representation to the output. Here we investigate the informational content of these representations by observing the ability of convolutional image classification models to autoencode the model's input using embeddings existing in various layers. We find that the deeper the layer, the less accurate that layer's representation of the input is before training. Inaccurate representation results from non-uniqueness in which various distinct inputs give approximately the same embedding. Non-unique representation is a consequence of both exact and approximate non-invertibility of transformations present in the forward pass. Learning to classify natural images leads to an increase in representation clarity for early but not late layers, which instead form abstract images. Rather than simply selecting for features present in the input necessary for classification, deep layer representations are found to transform the input so that it matches representations of the training data such that arbitrary inputs are mapped to manifolds learned during training. This work provides support for the theory that the tasks of image recognition and input generation are inseparable even for models trained exclusively to classify.
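
The non-uniqueness argument in the abstract can be illustrated on a toy layer. The sketch below is not the authors' code: the weights `W`, the helpers `layer` and `reconstruct`, and all hyperparameters are hypothetical, chosen only for illustration. It runs gradient descent on the input until a dimension-reducing linear map followed by ReLU reproduces a target embedding, then shows that the recovered input differs from the original even though the embeddings agree.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def layer(x, W):
    """One toy layer (hypothetical): matrix multiply followed by ReLU."""
    return relu([sum(w * xi for w, xi in zip(row, x)) for row in W])

def reconstruct(target, W, x0, lr=0.1, steps=500):
    """Gradient descent on the input so that layer(x) approaches target."""
    x = list(x0)
    for _ in range(steps):
        pre = [sum(w * xi for w, xi in zip(row, x)) for row in W]
        out = relu(pre)
        # Gradient of the squared error w.r.t. the pre-activation;
        # it is zero wherever the ReLU is inactive.
        g_pre = [2 * (o - t) if p > 0 else 0.0
                 for o, t, p in zip(out, target, pre)]
        # Back-propagate through the linear map to the input.
        g_x = [sum(g * row[j] for g, row in zip(g_pre, W))
               for j in range(len(x))]
        x = [xi - lr * gi for xi, gi in zip(x, g_x)]
    return x

# Assumed toy weights: a 3 -> 2 map, so exact inversion is impossible.
W = [[1.0, 0.0, 1.0],
     [0.0, 1.0, -1.0]]
original = [1.0, 2.0, 0.5]
target = layer(original, W)

# Start from a different point and match the embedding, not the input.
recovered = reconstruct(target, W, x0=[0.5, 0.5, 0.0])

embedding_gap = max(abs(a - b) for a, b in zip(layer(recovered, W), target))
input_gap = max(abs(a - b) for a, b in zip(recovered, original))
print("embedding gap:", embedding_gap)
print("input gap:", input_gap)
```

In this toy setting the embedding gap shrinks to nearly zero while the input gap stays large, which is the sense in which a non-invertible layer gives a non-unique representation. A real experiment on a convolutional classifier would need an autodiff framework and a trained model, neither of which is assumed here.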

Code Implementations: 1 repo