Improving robustness and calibration in ensembles with diversity regularization
This work addresses calibration and uncertainty estimation in high-risk environments, offering an incremental improvement by regularizing ensemble diversity to enhance performance under dataset shift.
The paper aims to improve the calibration and robustness of ensembles for classification tasks. It introduces a new diversity regularizer that uses out-of-distribution samples and increases accuracy, calibration, and out-of-distribution detection capabilities. Experiments on CIFAR-10, CIFAR-100, and SVHN show that regularizing diversity can have a significant impact on calibration and robustness.
Calibration and uncertainty estimation are crucial topics in high-risk environments. We introduce a new diversity regularizer for classification tasks that uses out-of-distribution samples and increases the overall accuracy, calibration, and out-of-distribution detection capabilities of ensembles. Following the recent interest in the diversity of ensembles, we systematically evaluate the viability of explicitly regularizing ensemble diversity to improve calibration on in-distribution data as well as under dataset shift. We demonstrate that diversity regularization is highly beneficial in architectures where weights are partially shared between the individual members, and that it even allows fewer ensemble members to be used to reach the same level of robustness. Experiments on CIFAR-10, CIFAR-100, and SVHN show that regularizing diversity can have a significant impact on calibration and robustness, as well as out-of-distribution detection.
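To make the idea of a diversity regularizer on out-of-distribution samples concrete, the following is a minimal sketch and not the formulation used in this work, which the abstract does not specify. It assumes diversity is measured as the mean KL divergence of each member's prediction from the ensemble mean (the generalized Jensen-Shannon divergence), and that this term is subtracted from the average cross-entropy with a weight `lam`. The names `diversity`, `regularized_loss`, and `lam` are hypothetical and chosen for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class, clamped to avoid log(0)."""
    return -math.log(max(probs[label], 1e-12))


def diversity(member_probs):
    """Assumed diversity measure: mean KL(p_i || ensemble mean).

    This equals the generalized Jensen-Shannon divergence between members.
    It is 0 when all members agree and grows as they disagree.
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    mean = [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]
    kl_sum = 0.0
    for p in member_probs:
        # Terms with p[c] == 0 contribute 0 to the KL divergence.
        kl_sum += sum(pc * math.log(pc / mean[c]) for c, pc in enumerate(p) if pc > 0)
    return kl_sum / n_members


def regularized_loss(id_logits, id_label, ood_logits, lam=0.1):
    """Hypothetical objective for one in-distribution and one OOD input.

    id_logits / ood_logits: one logit vector per ensemble member.
    The loss is the mean member cross-entropy on the in-distribution input
    minus lam times the member diversity on the out-of-distribution input.
    """
    id_probs = [softmax(l) for l in id_logits]
    ce = sum(cross_entropy(p, id_label) for p in id_probs) / len(id_probs)
    div = diversity([softmax(l) for l in ood_logits])
    return ce - lam * div
```

Under these assumptions, members that disagree on an out-of-distribution input lower the loss, while the cross-entropy term still drives each member toward the correct label on in-distribution data. In practice the same quantity would be computed on batches with an automatic differentiation framework.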