An Empirical Analysis of the Advantages of Finite- vs. Infinite-Width Bayesian Neural Networks
This work addresses a methodological gap for researchers in Bayesian deep learning, though the contribution is incremental, since it builds on existing BNN theory.
The paper tackles the challenge of comparing Bayesian neural networks (BNNs) of varying widths by empirically analyzing finite- vs. infinite-width BNNs. It finds that increasing width can harm performance under model misspecification, and that finite-width BNNs generalize better in part because of their frequency spectrum properties.
Comparing Bayesian neural networks (BNNs) with different widths is challenging because, as the width increases, multiple model properties change simultaneously, and inference in the finite-width case is intractable. In this work, we empirically compare finite- and infinite-width BNNs, and provide quantitative and qualitative explanations for their performance difference. We find that when the model is misspecified, increasing width can hurt BNN performance. In these cases, we provide evidence that finite-width BNNs generalize better partially due to the properties of their frequency spectrum that allow them to adapt under model mismatch.
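To make the finite- vs. infinite-width distinction concrete, the following is a minimal sketch and not the experimental setup of this work. It assumes a one-hidden-layer ReLU BNN without biases, with hidden weights drawn from N(0, 1/d) and output weights from N(0, 1/width); the function names are hypothetical. Under these assumptions the prior covariance of the outputs equals the order-1 arc-cosine kernel at every width, while the marginal distribution is heavy-tailed at small width and approaches a Gaussian (kurtosis 3) as width grows.

```python
import numpy as np

def sample_bnn_prior_outputs(x, width, n_samples, rng):
    """Draw prior function values from a one-hidden-layer ReLU BNN (no biases).

    x has shape (n_points, d); the result has shape (n_samples, n_points).
    Assumed prior: hidden weights ~ N(0, 1/d), output weights ~ N(0, 1/width).
    """
    d = x.shape[1]
    w1 = rng.normal(0.0, 1.0 / np.sqrt(d), size=(n_samples, d, width))
    w2 = rng.normal(0.0, 1.0 / np.sqrt(width), size=(n_samples, width))
    hidden = np.maximum(np.einsum("nd,sdw->snw", x, w1), 0.0)  # ReLU features
    return np.einsum("snw,sw->sn", hidden, w2)

def relu_nngp_kernel(x):
    """Infinite-width covariance for the prior above (order-1 arc-cosine kernel).

    Assumes every row of x is nonzero, so the angles are well defined.
    """
    d = x.shape[1]
    gram = x @ x.T / d                     # covariance of the pre-activations
    norms = np.sqrt(np.diag(gram))
    scale = np.outer(norms, norms)
    theta = np.arccos(np.clip(gram / scale, -1.0, 1.0))
    return scale * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)

def kurtosis(samples):
    """Plain (non-excess) sample kurtosis; a Gaussian has kurtosis 3."""
    centered = samples - samples.mean()
    return np.mean(centered ** 4) / np.mean(centered ** 2) ** 2

# Illustrative inputs; any nonzero points would do.
rng = np.random.default_rng(0)
x = np.array([[1.0, 0.0], [0.6, 0.8], [-1.0, 0.5]])
for width in (2, 128):
    f = sample_bnn_prior_outputs(x, width=width, n_samples=5000, rng=rng)
    print(f"width={width}: kurtosis at x[0] = {kurtosis(f[:, 0]):.2f}")
print("limit kernel:\n", relu_nngp_kernel(x))
```

The sketch only concerns the prior. It does not address posterior inference, which is the intractable part in the finite-width case, nor the frequency-spectrum analysis.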