LGCVMLJan 14, 2018

Fix your classifier: the marginal value of training the last weight layer

arXiv:1801.04540v2105 citations
AI Analysis

This addresses resource efficiency for practitioners deploying neural networks in classification tasks, though it is incremental as it builds on existing classifier designs.

The paper tackles the problem of reducing the parameter count and computational cost of the final classifier layer in neural networks by fixing it up to a global scale constant, showing little or no accuracy loss for most tasks, with initialization using a Hadamard matrix to speed up inference.

Neural networks are commonly used as models for classification for a wide variety of tasks. Typically, a learned affine transformation is placed at the end of such models, yielding a per-class value used for classification. This classifier can have a vast number of parameters, which grows linearly with the number of possible classes, thus requiring increasingly more resources. In this work we argue that this classifier can be fixed, up to a global scale constant, with little or no loss of accuracy for most tasks, allowing memory and computational benefits. Moreover, we show that by initializing the classifier with a Hadamard matrix we can speed up inference as well. We discuss the implications for current understanding of neural network models.

Code Implementations7 repos

Data from Papers with Code (CC-BY-SA-4.0)

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes