AI · Jul 15, 2021

Uncertainty-Aware Reliable Text Classification

arXiv:2107.07114v118.843 citations · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the need for reliable uncertainty estimation in natural language processing, particularly for text classification. The advance is incremental: it adapts existing evidential uncertainty methods to a new domain.

The paper tackles over-confident predictions in text classification under domain shift and out-of-distribution (OOD) inputs by applying evidential uncertainty to OOD detection. The authors report that their model outperforms the compared methods at detecting OOD examples.

Deep neural networks have contributed significantly to predictive accuracy in classification tasks. However, they tend to make over-confident predictions in real-world settings, where domain shift and out-of-distribution (OOD) examples exist. Most research on uncertainty estimation focuses on computer vision because it allows visual validation of uncertainty quality; few studies address the natural language processing domain. Unlike Bayesian methods, which infer uncertainty indirectly through weight uncertainties, current evidential uncertainty-based methods explicitly model the uncertainty of class probabilities through subjective opinions. They further distinguish inherent uncertainty in data by its root cause: vacuity (i.e., uncertainty due to a lack of evidence) and dissonance (i.e., uncertainty due to conflicting evidence). In our paper, we are the first to apply evidential uncertainty to OOD detection for text classification tasks. We propose an inexpensive framework that adopts both auxiliary outliers and pseudo off-manifold samples to train the model with prior knowledge of a certain class, which has high vacuity for OOD samples. Extensive empirical experiments demonstrate that our model based on evidential uncertainty outperforms its counterparts at detecting OOD examples. Our approach can be easily deployed to traditional recurrent neural networks and fine-tuned pre-trained transformers.
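
To make the two uncertainty types concrete, here is a minimal sketch of how vacuity and dissonance are commonly computed from per-class evidence in subjective logic. It is not taken from the paper's code: the function names are hypothetical, and it assumes the usual convention of a uniform prior whose weight equals the number of classes K, so that belief for class k is e_k / S and vacuity is K / S with S = sum of evidence + K.

```python
from typing import List, Sequence, Tuple


def subjective_opinion(evidence: Sequence[float]) -> Tuple[List[float], float]:
    """Map non-negative per-class evidence to (belief masses, vacuity).

    Assumes a uniform prior with weight K (the number of classes), so the
    Dirichlet strength is S = sum(evidence) + K.
    """
    num_classes = len(evidence)
    strength = sum(evidence) + num_classes
    belief = [e / strength for e in evidence]
    vacuity = num_classes / strength  # high when total evidence is small
    return belief, vacuity


def _balance(b_j: float, b_k: float) -> float:
    """Relative balance of two belief masses: 1 when equal, 0 if either is 0."""
    if b_j == 0 or b_k == 0:
        return 0.0
    return 1.0 - abs(b_j - b_k) / (b_j + b_k)


def dissonance(belief: Sequence[float]) -> float:
    """Uncertainty due to conflicting evidence across classes."""
    total = 0.0
    for k, b_k in enumerate(belief):
        others = [b for j, b in enumerate(belief) if j != k]
        denom = sum(others)
        if denom == 0:
            continue  # no competing belief mass, so no conflict for class k
        total += b_k * sum(b_j * _balance(b_j, b_k) for b_j in others) / denom
    return total


if __name__ == "__main__":
    for name, ev in [("no evidence", [0, 0, 0]),
                     ("conflicting", [10, 10, 0]),
                     ("confident", [20, 0, 0])]:
        b, u = subjective_opinion(ev)
        print(f"{name}: vacuity={u:.3f} dissonance={dissonance(b):.3f}")
```

Under these assumptions, an input with no evidence for any class has vacuity 1, which is the behavior the abstract targets for OOD samples. Strong, equal evidence for two classes gives low vacuity but high dissonance, and strong evidence for a single class gives low values for both.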
