LG · CL · ML · Dec 14, 2023

Well-calibrated Confidence Measures for Multi-label Text Classification with a Large Number of Labels

arXiv:2312.09304v152 citations · h-index: 21 · Pattern Recognition
Originality: Incremental advance
AI Analysis

This work addresses a computational bottleneck in confidence calibration for multi-label text classification. The advance is incremental but practically useful for applications that need reliable predictions over many labels.

The paper tackles the computational inefficiency of Label Powerset Inductive Conformal Prediction (LP-ICP) for multi-label text classification with many labels. It eliminates label-sets whose p-values are certain to fall below the chosen significance level, which reduces complexity while preserving the standard conformal prediction guarantees. Experimental results show that a classifier based on contextualised (BERT) embeddings achieves state-of-the-art performance on three datasets, and that the prediction sets are tight and well calibrated even though the space of possible label-sets exceeds 1e+16 combinations.

We extend our previous work on Inductive Conformal Prediction (ICP) for multi-label text classification and present a novel approach for addressing the computational inefficiency of the Label Powerset (LP) ICP, arising when dealing with a large number of unique labels. We present experimental results using the original and the proposed efficient LP-ICP on two English and one Czech language data-sets. Specifically, we apply the LP-ICP on three deep Artificial Neural Network (ANN) classifiers of two types: one based on contextualised (BERT) and two on non-contextualised (word2vec) word-embeddings. In the LP-ICP setting we assign nonconformity scores to label-sets, from which the corresponding p-values and prediction-sets are determined. Our approach deals with the increased computational burden of LP by eliminating from consideration a significant number of label-sets that will surely have p-values below the specified significance level. This dramatically reduces the computational complexity of the approach while fully respecting the standard CP guarantees. Our experimental results show that the contextualised-based classifier surpasses the non-contextualised-based ones and obtains state-of-the-art performance for all data-sets examined. The good performance of the underlying classifiers is carried over to their ICP counterparts without any significant accuracy loss, but with the added benefits of ICP, i.e. the confidence information encapsulated in the prediction sets. We experimentally demonstrate that the resulting prediction sets can be tight enough to be practically useful even though the set of all possible label-sets contains more than $10^{16}$ combinations. Additionally, the empirical error rates of the obtained prediction-sets confirm that our outputs are well-calibrated.
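The pruning idea in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not state the nonconformity measure, so the code assumes an L1 distance between the classifier's per-label probabilities and the label-set indicator vector. All function names (`nonconformity`, `p_value`, `brute_force_region`, `pruned_region`) and the tolerance `tol` are hypothetical. Under that assumption, a label-set has p-value above the significance level `eps` only if its score does not exceed the k-th largest calibration score, with k = floor(eps * (q + 1)) for q calibration examples. A depth-first search can therefore stop as soon as the accumulated score passes that threshold, instead of scoring all 2^n label-sets.

```python
import math
from itertools import product


def nonconformity(probs, label_set):
    """Assumed measure: L1 distance between probabilities and the label-set indicator."""
    return sum((1.0 - p) if j in label_set else p for j, p in enumerate(probs))


def p_value(cal_scores, score):
    """ICP p-value: fraction of scores (calibration plus candidate) at least as large."""
    count = sum(1 for a in cal_scores if a >= score)
    return (count + 1) / (len(cal_scores) + 1)


def brute_force_region(probs, cal_scores, eps):
    """Reference version: score every one of the 2**n label-sets (small n only)."""
    region = set()
    for bits in product([0, 1], repeat=len(probs)):
        y = frozenset(j for j, b in enumerate(bits) if b)
        if p_value(cal_scores, nonconformity(probs, y)) > eps:
            region.add(y)
    return region


def pruned_region(probs, cal_scores, eps, tol=1e-9):
    """Same prediction set, but skips label-sets whose p-value cannot exceed eps."""
    q = len(cal_scores)
    # p-value > eps requires at least k calibration scores >= the candidate score,
    # i.e. the candidate score must not exceed the k-th largest calibration score.
    k = math.floor(eps * (q + 1))
    threshold = math.inf if k == 0 else sorted(cal_scores, reverse=True)[k - 1]

    # The lowest-scoring label-set rounds each probability; flipping label j
    # away from it adds |1 - 2 * p_j| to the score.
    best = frozenset(j for j, p in enumerate(probs) if p > 0.5)
    budget = threshold - nonconformity(probs, best)
    flip_cost = [abs(1.0 - 2.0 * p) for p in probs]
    order = sorted(range(len(probs)), key=lambda j: flip_cost[j])

    region = set()

    def visit(start, extra, flipped):
        y = best ^ flipped  # apply the chosen flips to the best label-set
        if p_value(cal_scores, nonconformity(probs, y)) > eps:
            region.add(y)
        for i in range(start, len(order)):
            cost = flip_cost[order[i]]
            if extra + cost > budget + tol:
                break  # costs are sorted, so every later flip is too expensive as well
            visit(i + 1, extra + cost, flipped | {order[i]})

    if budget >= -tol:  # otherwise even the best label-set is rejected
        visit(0, 0.0, frozenset())
    return region
```

Calling `pruned_region(probs, cal_scores, eps)` with a list of per-label probabilities and a list of calibration scores returns a set of `frozenset` label-sets. Because each surviving candidate is still checked with the ordinary p-value, the result matches `brute_force_region`; only the number of label-sets visited changes.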

