A Strongly Quasiconvex PAC-Bayesian Bound
This work addresses a challenge that is both theoretical and practical for researchers and practitioners who use PAC-Bayesian methods. The contribution is incremental in that it builds on existing bounds and optimization techniques.
The authors tackle the problem of tuning the trade-off between complexity and empirical performance in PAC-Bayesian bounds. They propose a new bound that is convex in the posterior distribution and also convex in a trade-off parameter, together with an alternating minimization procedure that is guaranteed to converge to the global minimum of the bound when certain sufficient conditions hold. In their experiments, rigorous minimization of this bound is competitive with cross-validation, and the trade-off turned out to be quasiconvex in every experiment, even when the sufficient conditions were violated.
We propose a new PAC-Bayesian bound and a way of constructing a hypothesis space, so that the bound is convex in the posterior distribution and also convex in a trade-off parameter between empirical performance of the posterior distribution and its complexity. The complexity is measured by the Kullback-Leibler divergence to a prior. We derive an alternating procedure for minimizing the bound. We show that the bound can be rewritten as a one-dimensional function of the trade-off parameter and provide sufficient conditions under which the function has a single global minimum. When the conditions are satisfied, the alternating minimization is guaranteed to converge to the global minimum of the bound. We provide experimental results demonstrating that rigorous minimization of the bound is competitive with cross-validation in tuning the trade-off between complexity and empirical performance. In all our experiments the trade-off turned out to be quasiconvex even when the sufficient conditions were violated.
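The alternating procedure can be illustrated with a minimal sketch over a finite hypothesis set. The abstract does not state the bound explicitly, so the form used here is an assumption: E_rho[L_hat] / (1 - lam/2) + (KL(rho || pi) + ln(2 sqrt(n) / delta)) / (n lam (1 - lam/2)), with trade-off parameter lam in (0, 2). Under that assumed form, the minimizing posterior for fixed lam is a Gibbs distribution and the minimizing lam for a fixed posterior has a closed form. All function names and the example inputs are hypothetical.

```python
import math

def kl(rho, pi):
    # Kullback-Leibler divergence KL(rho || pi) for discrete distributions.
    return sum(r * math.log(r / p) for r, p in zip(rho, pi) if r > 0)

def bound(rho, lam, losses, pi, n, delta):
    # ASSUMED bound form (not stated in the abstract); requires 0 < lam < 2.
    emp = sum(r * l for r, l in zip(rho, losses))
    comp = kl(rho, pi) + math.log(2 * math.sqrt(n) / delta)
    return emp / (1 - lam / 2) + comp / (n * lam * (1 - lam / 2))

def gibbs_posterior(lam, losses, pi, n):
    # For fixed lam the assumed bound is minimized by
    # rho(h) proportional to pi(h) * exp(-lam * n * loss(h)).
    # Subtracting the minimum loss keeps the exponentials stable.
    m = min(losses)
    w = [p * math.exp(-lam * n * (l - m)) for p, l in zip(pi, losses)]
    z = sum(w)
    return [x / z for x in w]

def best_lambda(rho, losses, pi, n, delta):
    # For fixed rho, setting the derivative of the assumed bound to zero
    # gives this closed form; the result always lies in (0, 1].
    emp = sum(r * l for r, l in zip(rho, losses))
    comp = kl(rho, pi) + math.log(2 * math.sqrt(n) / delta)
    return 2 / (math.sqrt(2 * n * emp / comp + 1) + 1)

def alternate(losses, pi, n, delta=0.05, lam=1.0, iters=100, tol=1e-12):
    # Alternate the two exact minimizations until the bound stops improving.
    rho = gibbs_posterior(lam, losses, pi, n)
    value = bound(rho, lam, losses, pi, n, delta)
    for _ in range(iters):
        lam = best_lambda(rho, losses, pi, n, delta)
        rho = gibbs_posterior(lam, losses, pi, n)
        new_value = bound(rho, lam, losses, pi, n, delta)
        improved = value - new_value
        value = new_value
        if improved < tol:
            break
    return rho, lam, value

# Hypothetical example: three hypotheses with validation losses, uniform prior.
losses = [0.1, 0.3, 0.5]
pi = [1 / 3, 1 / 3, 1 / 3]
rho, lam, value = alternate(losses, pi, n=1000)
```

Each step minimizes the assumed bound exactly in one argument while holding the other fixed, so the bound value never increases across iterations. Whether the limit is the global minimum depends on the sufficient conditions mentioned above, which this sketch does not check.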