Rethinking Neural Networks With Benford's Law
This work proposes a novel early stopping mechanism for neural networks, which could benefit practitioners by reducing the need for a validation set and potentially improving generalization.
The authors propose a new metric, Model Enthalpy (MLH), based on Benford's Law, which measures the closeness of neural network parameters to this law. They demonstrate that MLH is a strong predictor of validation accuracy and can be used for early stopping, potentially outperforming traditional early stopping with a validation set.
Benford's Law (BL), also known as the Significant Digit Law, defines the probability distribution of the first digit of numerical values in a data sample. The law is observed in many naturally occurring datasets. It can be seen as a measure of the naturalness of a given distribution and is applied in areas such as anomaly and fraud detection. In this work, we address the following question: Is the distribution of the Neural Network parameters related to the network's generalization capability? To that end, we first define a metric, MLH (Model Enthalpy), that measures the closeness of a set of numbers to Benford's Law, and we show empirically that it is a strong predictor of Validation Accuracy. Second, we use MLH as an alternative to Validation Accuracy for Early Stopping, removing the need for a Validation set. We provide experimental evidence that, even when the optimal size of the validation set is known beforehand, the peak test accuracy attained with a validation set is lower than that attained without one. Finally, we investigate the connection of BL to the Free Energy Principle and the First Law of Thermodynamics, showing that MLH is a component of the internal energy of the learning system and that optimization is analogous to minimizing the total energy to attain equilibrium.
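To make the idea of "closeness to Benford's Law" concrete, here is a minimal Python sketch. Benford's Law assigns the leading digit d (1 to 9) the probability log10(1 + 1/d). The abstract does not give the formula for MLH, so the score below, a Pearson correlation between observed first-digit frequencies and the Benford frequencies, is only an illustrative stand-in and should not be read as the authors' definition. The function names (`first_digit`, `first_digit_distribution`, `benford_correlation`) are hypothetical.

```python
import math
from collections import Counter

# Benford's Law: P(d) = log10(1 + 1/d) for leading digit d in 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def first_digit(x):
    """Return the leading significant digit of a nonzero, finite number."""
    # Scientific notation puts the leading digit first, e.g. '1.23e-03'.
    # 17 significant digits avoid rounding 9.99... up to 1.0e+01.
    return int(f"{abs(x):.16e}"[0])


def first_digit_distribution(values):
    """Observed frequency of each leading digit 1..9 (zeros, inf, nan skipped)."""
    digits = [first_digit(v) for v in values if v != 0 and math.isfinite(v)]
    if not digits:
        raise ValueError("need at least one nonzero, finite value")
    counts = Counter(digits)
    return [counts.get(d, 0) / len(digits) for d in range(1, 10)]


def benford_correlation(values):
    """Pearson correlation between observed and Benford digit frequencies.

    ASSUMPTION: an illustrative closeness score, not the paper's MLH formula.
    """
    observed = first_digit_distribution(values)
    mean_o = sum(observed) / 9
    mean_b = sum(BENFORD) / 9
    cov = sum((o - mean_o) * (b - mean_b) for o, b in zip(observed, BENFORD))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_b = sum((b - mean_b) ** 2 for b in BENFORD)
    if math.isclose(var_o, 0.0, abs_tol=1e-18):
        return 0.0  # flat digit histogram: correlation is undefined
    return cov / math.sqrt(var_o * var_b)


# Usage: powers of two are a classic sequence that follows Benford's Law.
powers_of_two = [2.0 ** k for k in range(1000)]
score = benford_correlation(powers_of_two)
```

In the setting the abstract describes, `values` would be the flattened weights of a network at a given training step, and a score of this kind would be tracked across training in place of validation accuracy. How the paper actually computes and thresholds MLH is not stated here.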