PAC-Bayesian Generalization Bounds for MultiLayer Perceptrons
This work provides theoretical generalization guarantees for neural networks, a foundational problem in machine learning, though the contribution appears incremental because it builds on existing PAC-Bayesian and variational inference frameworks.
The paper sets out to establish a probabilistic foundation for PAC-Bayesian generalization bounds on Multilayer Perceptrons (MLPs) trained with the cross-entropy loss, showing that minimizing these bounds is equivalent to maximizing the Evidence Lower Bound (ELBO) and validating the bounds on benchmark datasets.
We study PAC-Bayesian generalization bounds for Multilayer Perceptrons (MLPs) with the cross-entropy loss. First, we introduce two probabilistic interpretations of MLPs: (i) MLPs formulate a family of Gibbs distributions, and (ii) minimizing the cross-entropy loss for MLPs is equivalent to Bayesian variational inference. Together, these establish a solid probabilistic foundation for studying PAC-Bayesian bounds on MLPs. Furthermore, based on the Evidence Lower Bound (ELBO), we prove that MLPs with the cross-entropy loss inherently guarantee PAC-Bayesian generalization bounds, and that minimizing PAC-Bayesian generalization bounds for MLPs is equivalent to maximizing the ELBO. Finally, we validate the proposed PAC-Bayesian generalization bound on benchmark datasets.
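To make the Gibbs-distribution reading concrete, the following is a minimal sketch of one standard interpretation at the output layer only: a softmax over logits is a Gibbs distribution whose energies are the negated logits, and the cross-entropy loss for a one-hot label is the negative log-probability under that distribution. The function names, the `temperature` parameter, and the example logits are illustrative assumptions; the paper's own construction, which may also cover hidden layers, is not reproduced here.

```python
import math


def gibbs_distribution(logits, temperature=1.0):
    """Softmax viewed as a Gibbs distribution with energies E_k = -logit_k.

    Assumption: this covers only the output layer of an MLP.
    """
    energies = [-z for z in logits]
    # Shift by the minimum energy for numerical stability; the shift
    # cancels when the weights are normalized.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / temperature) for e in energies]
    partition = sum(weights)  # partition function, up to the stabilizing shift
    return [w / partition for w in weights]


def cross_entropy(logits, label):
    """Cross-entropy for a one-hot label: negative log-probability of that label."""
    return -math.log(gibbs_distribution(logits)[label])


# Hypothetical output-layer activations for a 3-class problem.
logits = [2.0, 0.5, -1.0]
probs = gibbs_distribution(logits)
loss = cross_entropy(logits, label=0)
```

Under this reading, lowering the cross-entropy loss on a labeled example is the same as raising the log-probability that the Gibbs distribution assigns to the correct label, which is the likelihood term that appears in an ELBO.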