LG · Aug 5, 2021

PI3NN: Out-of-distribution-aware prediction intervals from three neural networks

Siyan Liu, Pei Zhang, Dan Lu, Guannan Zhang

arXiv:2108.02327v3 · 7.5 · 13 citations

Originality: Incremental advance

AI Analysis

This work improves uncertainty quantification for machine learning practitioners by providing a more robust and efficient method for computing prediction intervals (PIs). The advance is incremental, since it builds on existing PI methods.

The authors address the problem of generating prediction intervals for uncertainty quantification in neural networks, targeting three issues: retraining for each confidence level, hyperparameter sensitivity, and underestimated uncertainty for out-of-distribution samples. Their PI3NN method outperformed state-of-the-art approaches in benchmark and real-world experiments.

We propose a novel prediction interval (PI) method for uncertainty quantification, which addresses three major issues with the state-of-the-art PI methods. First, existing PI methods require retraining of neural networks (NNs) for every given confidence level and suffer from the crossing issue in calculating multiple PIs. Second, they usually rely on customized loss functions with extra sensitive hyperparameters for which fine tuning is required to achieve a well-calibrated PI. Third, they usually underestimate uncertainties of out-of-distribution (OOD) samples, leading to over-confident PIs. Our PI3NN method calculates PIs from linear combinations of three NNs, each of which is independently trained using the standard mean squared error loss. The coefficients of the linear combinations are computed using root-finding algorithms to ensure tight PIs for a given confidence level. We theoretically prove that PI3NN can calculate PIs for a series of confidence levels without retraining NNs and that it completely avoids the crossing issue. Additionally, PI3NN does not introduce any unusual hyperparameters, resulting in stable performance. Furthermore, we address the OOD identification challenge by introducing an initialization scheme which provides reasonably larger PIs for the OOD samples than for the in-distribution samples. Benchmark and real-world experiments show that our method outperforms several state-of-the-art approaches with respect to predictive uncertainty quality, robustness, and OOD sample identification.
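
The construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the three networks are replaced by least-squares polynomial fits (`fit_poly`, a hypothetical stand-in that also minimizes squared error), the root-finding step is plain bisection, the spread models are clamped to stay positive, and the OOD-aware initialization scheme is omitted. All function names and the synthetic data are made up for illustration.

```python
import numpy as np

def fit_poly(x, y, degree=3):
    """Stand-in for an MSE-trained network: a least-squares polynomial fit."""
    coeffs = np.polyfit(x, y, degree)
    return lambda t: np.polyval(coeffs, t)

def positive(model, x, floor=1e-3):
    """Evaluate a spread model and clamp it so the width stays positive (assumption)."""
    return np.maximum(model(x), floor)

def fit_three_models(x, y):
    """Fit the central model f, then u and l on the residuals above and below f."""
    f = fit_poly(x, y)
    resid = y - f(x)
    above, below = resid > 0, resid < 0
    u = fit_poly(x[above], resid[above])    # spread of points above f
    l = fit_poly(x[below], -resid[below])   # spread of points below f
    return f, u, l

def solve_scale(resid, width, target):
    """Bisection for (approximately) the smallest c >= 0 such that
    at most `target` points satisfy resid > c * width."""
    lo, hi = 0.0, max(np.max(resid / width), 0.0) + 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if np.sum(resid > mid * width) > target:
            lo = mid   # too many points outside the bound: scale up
        else:
            hi = mid
    return hi

def prediction_interval(models, x_train, y_train, x_new, gamma):
    """Interval [f - beta * l, f + alpha * u] for confidence level gamma.
    Only alpha and beta depend on gamma; the three models are reused as-is."""
    f, u, l = models
    resid = y_train - f(x_train)
    target = len(x_train) * (1.0 - gamma) / 2.0   # allowed misses per side
    alpha = solve_scale(resid, positive(u, x_train), target)
    beta = solve_scale(-resid, positive(l, x_train), target)
    center = f(x_new)
    return center - beta * positive(l, x_new), center + alpha * positive(u, x_new)

# Synthetic heteroscedastic data (illustrative only).
rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, 500)
noise = (0.1 + 0.2 * np.abs(x_train)) * rng.normal(size=500)
y_train = np.sin(3.0 * x_train) + noise

models = fit_three_models(x_train, y_train)
lo50, hi50 = prediction_interval(models, x_train, y_train, x_train, 0.5)
lo90, hi90 = prediction_interval(models, x_train, y_train, x_train, 0.9)
coverage50 = np.mean((y_train >= lo50) & (y_train <= hi50))
coverage90 = np.mean((y_train >= lo90) & (y_train <= hi90))
```

In this sketch, changing the confidence level only re-runs the two bisections, so no model is refit. Because the widths are positive and the scale factors grow with the confidence level, the 90% interval contains the 50% interval at every input, which is the non-crossing behavior the abstract refers to.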
