Maximizing Overall Diversity for Improved Uncertainty Estimates in Deep Ensembles
This work addresses uncertainty estimation for neural networks, which is crucial for reliable AI applications in domains such as bioinformatics and image analysis. The contribution appears incremental, since it builds on existing ensemble methods.
The paper tackles inaccurate uncertainty estimates for out-of-distribution inputs in neural networks. It proposes Maximize Overall Diversity (MOD), a method that encourages greater diversity in ensemble predictions. MOD significantly improves predictive performance on out-of-distribution test examples across 38 Protein-DNA binding regression datasets, 9 UCI datasets, and the IMDB-Wiki image dataset, without sacrificing in-distribution performance.
The inaccuracy of neural network models on inputs that do not stem from the training data distribution is both problematic and at times unrecognized. Model uncertainty estimation can address this issue, where uncertainty estimates are often based on the variation in predictions produced by a diverse ensemble of models applied to the same input. Here we describe Maximize Overall Diversity (MOD), a straightforward approach to improve ensemble-based uncertainty estimates by encouraging larger overall diversity in ensemble predictions across all possible inputs that might be encountered in the future. When applied to various neural network ensembles, MOD significantly improves predictive performance for out-of-distribution test examples without sacrificing in-distribution performance on 38 Protein-DNA binding regression datasets, 9 UCI datasets, and the IMDB-Wiki image dataset. Across many Bayesian optimization tasks, the performance of UCB acquisition is also greatly improved by leveraging MOD uncertainty estimates.
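The idea in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`ensemble_stats`, `diversity_regularized_loss`, `ucb`), the squared-error fit term, the weight `gamma`, and the exploration factor `kappa` are all hypothetical choices made for the example. The sketch treats the variance of ensemble predictions on an input as its uncertainty estimate, rewards larger variance on auxiliary inputs drawn from the broader input space, and forms a UCB score as the ensemble mean plus a scaled standard deviation.

```python
import numpy as np


def ensemble_stats(preds):
    """Mean and variance across ensemble members.

    preds: array of shape (n_models, n_inputs), one row per model.
    """
    preds = np.asarray(preds, dtype=float)
    return preds.mean(axis=0), preds.var(axis=0)


def diversity_regularized_loss(train_preds, y_train, aux_preds, gamma=0.1):
    """Hypothetical objective: fit term minus a diversity bonus.

    train_preds: (n_models, n_train) predictions on training inputs.
    y_train:     (n_train,) training targets.
    aux_preds:   (n_models, n_aux) predictions on auxiliary inputs sampled
                 from the wider input space (assumed, e.g. uniformly).
    gamma:       assumed weight on the diversity term.
    """
    train_preds = np.asarray(train_preds, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    # Squared error averaged over all models and training points.
    fit = np.mean((train_preds - y_train) ** 2)
    # Average ensemble variance over the auxiliary inputs.
    _, aux_var = ensemble_stats(aux_preds)
    # Lower loss is better, so larger diversity reduces the loss.
    return fit - gamma * aux_var.mean()


def ucb(preds, kappa=1.0):
    """Upper confidence bound: ensemble mean plus kappa standard deviations."""
    mean, var = ensemble_stats(preds)
    return mean + kappa * np.sqrt(var)
```

In a Bayesian optimization loop, one would score candidate inputs with `ucb` and query the highest-scoring candidate next. Inputs on which the ensemble members disagree receive a larger exploration bonus.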