Understanding Deep Learning via Decision Boundary
This work addresses the fundamental problem of generalization in deep learning, offering new metrics and bounds, although it is incremental in that it builds on existing decision boundary concepts.
The paper studies generalization in deep learning by linking it to decision boundary (DB) variability, and its experiments show that lower variability correlates with better generalization. On the theoretical side, it gives two lower bounds based on algorithm DB variability that do not explicitly depend on the sample size, and an upper bound of order $\mathcal{O}\left(\frac{1}{\sqrt{m}}+\epsilon+\eta\log\frac{1}{\eta}\right)$ based on data DB variability that does not explicitly depend on the network size.
This paper finds that neural networks with lower decision boundary (DB) variability generalize better. Two new notions, algorithm DB variability and $(\epsilon, \eta)$-data DB variability, are proposed to measure decision boundary variability from the algorithm and data perspectives, respectively. Extensive experiments show significant negative correlations between decision boundary variability and generalizability. On the theoretical side, two lower bounds based on algorithm DB variability are proposed; they do not explicitly depend on the sample size. We also prove an upper bound of order $\mathcal{O}\left(\frac{1}{\sqrt{m}}+\epsilon+\eta\log\frac{1}{\eta}\right)$ based on data DB variability. This bound is convenient to estimate because it requires no labels, and it does not explicitly depend on the network size, which is usually prohibitively large in deep learning.
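To make the algorithm-perspective idea concrete, the following is a minimal sketch of one possible label-free proxy: the average rate at which models from repeated training runs disagree on the same inputs. It is an illustrative assumption, not the paper's formal definition of algorithm DB variability; the function name `pairwise_disagreement` and the toy predictions are hypothetical.

```python
from itertools import combinations


def pairwise_disagreement(predictions):
    """Average fraction of inputs on which two training runs predict different labels.

    predictions: list of equal-length label sequences, one per training run,
    all evaluated on the same inputs. No ground-truth labels are needed.
    This is an assumed proxy, not the paper's exact definition.
    """
    if len(predictions) < 2:
        raise ValueError("need predictions from at least two runs")
    n = len(predictions[0])
    if n == 0 or any(len(p) != n for p in predictions):
        raise ValueError("all runs must predict on the same non-empty input set")

    rates = []
    for a, b in combinations(predictions, 2):
        # Count the inputs where the two runs land on different sides
        # of their respective decision boundaries.
        differing = sum(1 for ya, yb in zip(a, b) if ya != yb)
        rates.append(differing / n)
    return sum(rates) / len(rates)


# Hypothetical predicted labels from three runs on the same five inputs.
runs = [
    [0, 1, 1, 2, 0],
    [0, 1, 2, 2, 0],
    [0, 1, 1, 2, 1],
]
score = pairwise_disagreement(runs)
```

A score near 0 means the runs agree almost everywhere, which under this proxy corresponds to low variability; larger values indicate decision boundaries that shift more between runs.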