Measuring Information Transfer in Neural Networks
This work provides a tool for analyzing neural network learning from an information-theoretic perspective, but the contribution is incremental because it builds on existing prequential coding methods.
The paper addresses the problem of quantifying generalizable information in neural networks. It proposes Information Transfer ($L_{IT}$), a measure based on prequential coding, shows that the measure correlates with generalizable information, and applies it to analyze datasets, transfer learning, and catastrophic forgetting.
Quantifying the information content of a neural network model essentially amounts to estimating the model's Kolmogorov complexity. The recent success of prequential coding on neural networks points to a promising path toward an efficient description length for a model. We propose a practical measure of the generalizable information in a neural network model based on prequential coding, which we term Information Transfer ($L_{IT}$). Theoretically, $L_{IT}$ is an estimate of the generalizable part of a model's information content. In experiments, we show that $L_{IT}$ is consistently correlated with generalizable information and can be used as a measure of patterns or "knowledge" in a model or a dataset. Consequently, $L_{IT}$ can serve as a useful analysis tool in deep learning. In this paper, we apply $L_{IT}$ to compare and dissect information in datasets, evaluate representation models in transfer learning, and analyze catastrophic forgetting and continual learning algorithms. $L_{IT}$ provides an information perspective that helps us discover new insights into neural network learning.
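To make the underlying idea concrete, the following is a minimal sketch of prequential coding on a toy label sequence. It is an illustration only, not the paper's procedure: the Laplace-smoothed count model stands in for a neural network, and the names `prequential_codelength`, `prior_counts`, and `saved_bits` are hypothetical. The abstract does not give the exact definition of $L_{IT}$, so the codelength difference computed at the end is an assumed proxy for the intuition that a model already holding relevant knowledge needs fewer bits to encode new data.

```python
import math
from collections import Counter


def prequential_codelength(labels, num_classes, prior_counts=None):
    """Bits needed to encode `labels` one symbol at a time.

    Each symbol is encoded with the model's current prediction, and the
    model is updated only after the symbol has been encoded.
    `prior_counts` (hypothetical) mimics knowledge acquired beforehand.
    """
    counts = Counter(prior_counts or {})
    total_bits = 0.0
    for y in labels:
        seen = sum(counts.values())
        # Laplace-smoothed predictive probability from what has been seen so far.
        p = (counts[y] + 1) / (seen + num_classes)
        total_bits += -math.log2(p)
        counts[y] += 1  # update the model after encoding the symbol
    return total_bits


labels = [0, 0, 1, 0, 0, 0, 1, 0]

# Encode starting from an uninformed model.
from_scratch = prequential_codelength(labels, num_classes=2)

# Encode starting from a model that has already seen similar data (assumed counts).
informed = prequential_codelength(labels, num_classes=2, prior_counts={0: 30, 1: 10})

# Assumed proxy for transferred information: the bits saved by prior knowledge.
saved_bits = from_scratch - informed
print(f"from scratch: {from_scratch:.3f} bits, informed: {informed:.3f} bits")
print(f"saved: {saved_bits:.3f} bits")
```

In this toy setting the informed model yields a shorter code because its predictions already match the label frequencies. With a neural network, the same loop would replace the count update with incremental training on the data encoded so far.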