Why High-rank Neural Networks Generalize?: An Algebraic Framework with RKHSs
This work provides a theoretical framework for generalization in neural networks, a foundational issue in machine learning, though the contribution appears incremental: it extends existing bounds to more practical scenarios.
The authors tackle the problem of explaining why high-rank neural networks generalize well by deriving a new Rademacher complexity bound using Koopman operators, group representations, and RKHSs. The bound applies to a wider range of realistic models than existing bounds, which cover only limited model types.
We derive a new Rademacher complexity bound for deep neural networks using Koopman operators, group representations, and reproducing kernel Hilbert spaces (RKHSs). The proposed bound describes why models with high-rank weight matrices generalize well. Although existing bounds attempt to describe this phenomenon, they can be applied only to limited types of models. We introduce an algebraic representation of neural networks and a kernel function to construct an RKHS, from which we derive a bound for a wider range of realistic models. This work paves the way for the Koopman-based theory of Rademacher complexity bounds to be valid in more practical situations.
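For orientation, the standard objects behind the abstract can be written out. The block below is a sketch of textbook definitions, not the paper's bound: the empirical Rademacher complexity of a function class, the classical generalization bound it yields for classes with values in [0,1], and the Koopman (composition) operator view of a layered network. The symbols \mathcal{F}, S, n, \sigma_i, \delta, g_j, and K_g are notation assumed here for illustration and may differ from the paper's.

```latex
% Empirical Rademacher complexity of a class F on a sample S = (x_1, ..., x_n),
% where sigma_1, ..., sigma_n are i.i.d. and uniform on {-1, +1}.
\[
  \hat{R}_S(\mathcal{F})
  = \mathbb{E}_{\sigma}\!\left[ \sup_{f \in \mathcal{F}}
      \frac{1}{n} \sum_{i=1}^{n} \sigma_i f(x_i) \right],
  \qquad
  R_n(\mathcal{F}) = \mathbb{E}_{S}\!\left[ \hat{R}_S(\mathcal{F}) \right].
\]

% Classical generalization bound (assumes every f in F takes values in [0,1]):
% with probability at least 1 - delta over the draw of S, for all f in F,
\[
  \mathbb{E}[f(x)]
  \le \frac{1}{n} \sum_{i=1}^{n} f(x_i)
      + 2 R_n(\mathcal{F})
      + \sqrt{\frac{\log(1/\delta)}{2n}}.
\]

% Koopman (composition) operator of a map g, acting on functions h,
% and the resulting operator form of an L-layer network f.
\[
  K_g h = h \circ g,
  \qquad
  f = g_L \circ \cdots \circ g_1
    = K_{g_1} K_{g_2} \cdots K_{g_{L-1}}\, g_L .
\]
```

In Koopman-based analyses, a bound on the network's complexity is typically obtained by bounding norms of the operators K_{g_j} on a suitable function space. According to the abstract, the space used here is an RKHS built from a kernel function.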