Over-Sampling in a Deep Neural Network
This work offers a theoretical explanation for scalability in deep learning, which could bear on ML/AI broadly, though the contribution is incremental because it builds on existing sampling theory.
The paper addresses why bigger deep neural networks perform better by interpreting them as discrete systems subject to sampling theory, and it demonstrates that over-sampled networks are more selective, learn faster, and learn more robustly.
Deep neural networks (DNNs) are the state of the art on many engineering problems such as computer vision and audition. A key factor in the success of DNNs is scalability: bigger networks work better. However, the reason for this scalability is not yet well understood. Here, we interpret the DNN as a discrete system of linear filters followed by nonlinear activations that is subject to the laws of sampling theory. In this context, we demonstrate that over-sampled networks are more selective, learn faster, and learn more robustly. Our findings may ultimately generalize to the human brain.
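The sampling-theory framing can be made concrete with a small sketch. This is not the paper's method or experiment. It is a generic illustration, under our own assumptions, of two ideas the abstract relies on: a layer modeled as a linear filter followed by a nonlinear activation, and the loss of frequency information when a discrete system samples too coarsely. All function names (`sample_sine`, `dominant_frequency`, `layer`) and the chosen frequencies are hypothetical, and ReLU is assumed as the activation only for illustration.

```python
import math

def sample_sine(freq_hz, rate_hz, n_samples):
    """Sample sin(2*pi*f*t) at n_samples points spaced 1/rate_hz seconds apart."""
    return [math.sin(2 * math.pi * freq_hz * k / rate_hz) for k in range(n_samples)]

def dominant_frequency(samples, rate_hz):
    """Return the frequency (Hz) of the largest DFT bin between 0 and rate_hz/2."""
    n = len(samples)
    best_bin, best_mag = 0, -1.0
    for b in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * b * k / n) for k, x in enumerate(samples))
        im = sum(x * math.sin(2 * math.pi * b * k / n) for k, x in enumerate(samples))
        mag = math.hypot(re, im)
        if mag > best_mag:
            best_bin, best_mag = b, mag
    return best_bin * rate_hz / n

def layer(samples, kernel):
    """One layer in the abstract's sense: a linear FIR filter, then a nonlinearity.

    The filter is applied in cross-correlation form (kernel not flipped), and
    ReLU is an assumed choice of activation.
    """
    m = len(kernel)
    filtered = [
        sum(kernel[j] * samples[i + j] for j in range(m))
        for i in range(len(samples) - m + 1)
    ]
    return [max(0.0, y) for y in filtered]

if __name__ == "__main__":
    # A 7 Hz tone sampled at 10 Hz lies above the Nyquist limit (5 Hz),
    # so it aliases and shows up at 3 Hz.
    print(dominant_frequency(sample_sine(7, 10, 100), 10))    # 3.0
    # Sampled at 100 Hz (over-sampled), the same tone is recovered at 7 Hz.
    print(dominant_frequency(sample_sine(7, 100, 100), 100))  # 7.0
    # Linear filter followed by ReLU on a toy signal.
    print(layer([1, -2, 3, -4], [1, 1]))
```

The under-sampled case reports the wrong frequency because the samples of a 7 Hz tone at 10 Hz are indistinguishable from those of a 3 Hz tone. Over-sampling removes that ambiguity. How this maps onto network width or depth is the subject of the paper itself and is not modeled here.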