Degrees of Freedom in Deep Neural Networks
This work provides insight into model complexity and regularization for machine learning researchers, though the contribution is incremental: it extends existing statistical concepts to deep networks.
The paper addresses model complexity in deep sigmoidal neural networks by relating degrees of freedom to expected optimism. It shows that the degrees of freedom are far lower than the parameter count, by several orders of magnitude on some real datasets, and that for a fixed parameter count deeper networks have fewer degrees of freedom.
In this paper, we explore degrees of freedom in deep sigmoidal neural networks. We show that the degrees of freedom in these models are related to the expected optimism, which is the expected difference between test error and training error. We provide an efficient Monte Carlo method to estimate the degrees of freedom for multi-class classification methods. We show that the degrees of freedom are lower than the parameter count in a simple XOR network. We extend these results to neural networks trained on synthetic and real data, and investigate the impact of network architecture and of different regularization choices. The degrees of freedom in deep networks are dramatically smaller than the number of parameters, on some real datasets by several orders of magnitude. Further, we observe that for a fixed number of parameters, deeper networks have fewer degrees of freedom, exhibiting regularization by depth.
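To make the idea of estimating degrees of freedom by Monte Carlo concrete, here is a minimal sketch. It is not the paper's estimator, whose details the abstract does not give. It assumes a generic perturbation-based estimate of the sensitivity of fitted values to the training targets, and applies it to ridge regression as a stand-in model because the exact answer, the trace of the hat matrix, is available for comparison. The function names, the regularization strength `lam`, and the perturbation size `eps` are all hypothetical choices made for illustration.

```python
import numpy as np


def ridge_fit_predict(X, y, lam):
    """Fit ridge regression on (X, y) and return the fitted values on X."""
    d = X.shape[1]
    beta = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    return X @ beta


def mc_degrees_of_freedom(X, y, lam, eps=1e-3, n_draws=500, seed=0):
    """Monte Carlo estimate of sum_i d(yhat_i)/d(y_i).

    Assumption (not from the paper): perturb the targets with Gaussian
    noise delta, refit, and average delta . (yhat_perturbed - yhat) / eps.
    """
    rng = np.random.default_rng(seed)
    base = ridge_fit_predict(X, y, lam)
    total = 0.0
    for _ in range(n_draws):
        delta = rng.standard_normal(y.shape[0])
        perturbed = ridge_fit_predict(X, y + eps * delta, lam)
        total += delta @ (perturbed - base) / eps
    return total / n_draws


def exact_ridge_df(X, lam):
    """Exact degrees of freedom of ridge regression: trace of the hat matrix."""
    d = X.shape[1]
    hat = X @ np.linalg.solve(X.T @ X + lam * np.eye(d), X.T)
    return float(np.trace(hat))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((50, 10))  # 10 parameters
    y = X @ rng.standard_normal(10) + 0.1 * rng.standard_normal(50)
    print("Monte Carlo estimate:", mc_degrees_of_freedom(X, y, lam=5.0))
    print("Exact (trace of hat matrix):", exact_ridge_df(X, lam=5.0))
```

Even in this toy setting the regularized model has fewer degrees of freedom than its 10 parameters, which is the same qualitative effect the abstract reports for deep networks at a much larger scale.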