IT | LG | Jan 24, 2022

Analytic Mutual Information in Bayesian Neural Networks

arXiv:2201.09815v3 | 7 citations
AI Analysis

This work addresses a foundational gap in information-theoretic understanding of Bayesian neural networks, with practical implications for uncertainty quantification and active learning.

The authors derived an analytical formula for mutual information between model parameters and predictive output in Bayesian neural networks using point process entropy, and demonstrated its application in improving active learning performance.

Bayesian neural networks have been used successfully to design and optimize robust neural network models in many application problems, including uncertainty quantification. Despite this recent success, the information-theoretic understanding of Bayesian neural networks is still at an early stage. Mutual information is one uncertainty measure used in Bayesian neural networks to quantify epistemic uncertainty. Yet no analytic formula is known for it, even though it is one of the fundamental information measures for understanding the Bayesian deep learning framework. In this paper, we derive an analytical formula for the mutual information between the model parameters and the predictive output by leveraging the notion of point process entropy. Then, as an application, we discuss parameter estimation for the Dirichlet distribution and show its practical use in active learning uncertainty measures, demonstrating that our analytical formula can further improve active learning performance in practice.
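
For context, the mutual information between parameters and predictive output is commonly approximated by sampling rather than computed in closed form. The sketch below shows that standard Monte Carlo estimate (predictive entropy minus expected entropy, often called the BALD score in active learning). It is not the analytic formula derived in the paper. The function names and the toy probability vectors are assumptions made for illustration only.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mc_mutual_information(sampled_probs):
    """Monte Carlo estimate of I(y; theta | x) for one input x.

    sampled_probs: list of T class-probability vectors, one per posterior
    sample of the parameters (hypothetical input, e.g. one vector per
    stochastic forward pass).
    """
    num_samples = len(sampled_probs)
    num_classes = len(sampled_probs[0])
    # Predictive distribution: average of the sampled class probabilities.
    mean_probs = [
        sum(p[c] for p in sampled_probs) / num_samples
        for c in range(num_classes)
    ]
    predictive_entropy = entropy(mean_probs)
    # Average entropy of the individual sampled predictions.
    expected_entropy = sum(entropy(p) for p in sampled_probs) / num_samples
    return predictive_entropy - expected_entropy


# Toy data (made up): posterior samples that agree vs. disagree.
agree = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]
disagree = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
mi_agree = mc_mutual_information(agree)
mi_disagree = mc_mutual_information(disagree)
```

When the sampled predictions agree, the estimate is close to zero (little epistemic uncertainty); when they disagree, it is clearly positive. In an active learning loop, inputs with the highest score would typically be selected for labeling.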

