LG, ML | Oct 21, 2024

Revisiting the Equivalence of Bayesian Neural Networks and Gaussian Processes: On the Importance of Learning Activations

Marcin Sendera, Amin Sorkhei, Tomasz Kuśmierczyk

arXiv:2410.15777v34.62 citations | h-index: 1 | Has Code | UAI

Originality: Incremental advance

AI Analysis

This work addresses a challenge relevant to machine learning practitioners: combining the scalability of BNNs with the uncertainty modeling of GPs. It is an incremental advance, building on prior research into the equivalence of BNNs and GPs.

The paper tackles the problem of making Bayesian Neural Networks (BNNs) replicate Gaussian Process (GP) behavior, emphasizing the importance of learning activations. Its method matches or outperforms existing approaches while offering stronger theoretical foundations.

Gaussian Processes (GPs) provide a convenient framework for specifying function-space priors, making them a natural choice for modeling uncertainty. In contrast, Bayesian Neural Networks (BNNs) offer greater scalability and extendability but lack the advantageous properties of GPs. This motivates the development of BNNs capable of replicating GP-like behavior. However, existing solutions are either limited to specific GP kernels or rely on heuristics. We demonstrate that trainable activations are crucial for effective mapping of GP priors to wide BNNs. Specifically, we leverage the closed-form 2-Wasserstein distance for efficient gradient-based optimization of reparameterized priors and activations. Beyond learned activations, we also introduce trainable periodic activations that ensure global stationarity by design, and functional priors conditioned on GP hyperparameters to allow efficient model selection. Empirically, our method consistently outperforms existing approaches or matches performance of the heuristic methods, while offering stronger theoretical foundations.
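The closed-form 2-Wasserstein distance mentioned in the abstract can be illustrated with a small sketch. For two Gaussians it is W2^2 = ||m1 - m2||^2 + Tr(S1 + S2 - 2 (S1^{1/2} S2 S1^{1/2})^{1/2}). The code below is not the authors' implementation: the helper names (`psd_sqrt`, `w2_squared_gaussian`, `rbf_kernel`), the RBF kernel, and the choice to compare two zero-mean priors evaluated on a small grid of inputs are all assumptions made for illustration.

```python
import numpy as np

def psd_sqrt(mat):
    """Symmetric square root of a positive semi-definite matrix (hypothetical helper)."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T

def w2_squared_gaussian(m1, S1, m2, S2):
    """Squared 2-Wasserstein distance between N(m1, S1) and N(m2, S2)."""
    root1 = psd_sqrt(S1)
    cross = psd_sqrt(root1 @ S2 @ root1)
    mean_term = np.sum((m1 - m2) ** 2)
    cov_term = np.trace(S1 + S2 - 2.0 * cross)
    return float(mean_term + cov_term)

def rbf_kernel(x, lengthscale=1.0):
    """RBF covariance on 1-D inputs; an assumed stand-in for a GP prior."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

# Assumption: both priors are summarized by their covariance on a finite input grid.
x = np.linspace(-1.0, 1.0, 5)
zero = np.zeros(5)
K_a = rbf_kernel(x, lengthscale=1.0)
K_b = rbf_kernel(x, lengthscale=0.5)

d_same = w2_squared_gaussian(zero, K_a, zero, K_a)  # identical priors
d_diff = w2_squared_gaussian(zero, K_a, zero, K_b)  # different lengthscales
```

In a gradient-based setting, the same expression would be written in an autodiff framework so that the covariance induced by the network's prior and activation parameters can be optimized against a target GP covariance; that step is omitted here.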
