Visualizing the Diversity of Representations Learned by Bayesian Neural Networks
This work provides incremental insights into making BNN decision-making more interpretable for researchers and practitioners.
The study applied explainable AI methods to visualize and measure the diversity of feature representations in Bayesian Neural Networks, finding that Monte Carlo dropout shows greater diversity than MultiSWAG and that representation diversity correlates with output uncertainty.
Explainable Artificial Intelligence (XAI) aims to make learning machines less opaque, and offers researchers and practitioners various tools to reveal the decision-making strategies of neural networks. In this work, we investigate how XAI methods can be used for exploring and visualizing the diversity of feature representations learned by Bayesian Neural Networks (BNNs). Our goal is to provide a global understanding of BNNs by making their decision-making strategies a) visible and tangible through feature visualizations and b) quantitatively measurable with a distance measure learned by contrastive learning. Our work provides new insights into the \emph{posterior} distribution in terms of human-understandable feature information with regard to the underlying decision making strategies. The main findings of our work are the following: 1) global XAI methods can be applied to explain the diversity of decision-making strategies of BNN instances, 2) Monte Carlo dropout with commonly used Dropout rates exhibit increased diversity in feature representations compared to the multimodal posterior approximation of MultiSWAG, 3) the diversity of learned feature representations highly correlates with the uncertainty estimate for the output and 4) the inter-mode diversity of the multimodal posterior decreases as the network width increases, while the intra mode diversity increases. These findings are consistent with the recent Deep Neural Networks theory, providing additional intuitions about what the theory implies in terms of humanly understandable concepts.