CL LG · Sep 23, 2025

Confidence Calibration in Large Language Model-Based Entity Matching

Iris Kamsteeg, Juan Cardenas-Cartagena, Floris van Beers, Gineke ten Holt, Tsegaye Misikir Tashu, Matias Valdenegro-Toro

arXiv:2509.19557v2 · 1 citation · h-index: 7 · Proceedings of the 2nd Workshop on Uncertainty-Aware NLP (UncertaiNLP 2025)

Originality: Synthesis-oriented

AI Analysis

This work addresses confidence calibration for entity matching. The contribution is incremental: it applies existing calibration techniques to a specific domain.

This research examines confidence calibration in Large Language Model-based Entity Matching by comparing baseline RoBERTa confidences against confidences calibrated with methods such as Temperature Scaling. Temperature Scaling reduced Expected Calibration Error scores by up to 23.83%.

This research aims to explore the intersection of Large Language Models and confidence calibration in Entity Matching. To this end, we perform an empirical study to compare baseline RoBERTa confidences for an Entity Matching task against confidences that are calibrated using Temperature Scaling, Monte Carlo Dropout and Ensembles. We use the Abt-Buy, DBLP-ACM, iTunes-Amazon and Company datasets. The findings indicate that the proposed modified RoBERTa model exhibits a slight overconfidence, with Expected Calibration Error scores ranging from 0.0043 to 0.0552 across datasets. We find that this overconfidence can be mitigated using Temperature Scaling, reducing Expected Calibration Error scores by up to 23.83%.
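
To make the two quantities in the abstract concrete, the following is a minimal sketch of Expected Calibration Error (ECE) and Temperature Scaling. It is not the authors' code. It assumes synthetic two-class logits rather than RoBERTa outputs, 10 equal-width confidence bins, and a simple grid search over the temperature in place of the gradient-based optimizers often used for this step. All function names are illustrative.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Divide logits by the temperature, then apply a numerically stable softmax.
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def expected_calibration_error(probs, labels, n_bins=10):
    # ECE: weighted average of |accuracy - mean confidence| over confidence bins.
    confidences = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap  # in_bin.mean() is the bin's share of samples
    return float(ece)

def nll(logits, labels, temperature):
    # Negative log-likelihood of the true labels at a given temperature.
    probs = softmax(logits, temperature)
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12)))

def fit_temperature(logits, labels, grid=None):
    # Assumption: pick the temperature by grid search on held-out NLL.
    if grid is None:
        grid = np.linspace(0.05, 10.0, 400)
    losses = [nll(logits, labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])

# Synthetic, deliberately overconfident match / non-match classifier (assumption).
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(0.0, 1.5, size=n)              # well-calibrated logit difference
labels = (rng.random(n) < 1.0 / (1.0 + np.exp(-z))).astype(int)
logits = 3.0 * np.stack([np.zeros(n), z], axis=1)  # scaling by 3 causes overconfidence

# Fit the temperature on a validation half, evaluate on the other half.
val, test = slice(0, n // 2), slice(n // 2, n)
T = fit_temperature(logits[val], labels[val])
ece_before = expected_calibration_error(softmax(logits[test]), labels[test])
ece_after = expected_calibration_error(softmax(logits[test], T), labels[test])
relative_reduction = (ece_before - ece_after) / ece_before
print(f"T={T:.2f}  ECE before={ece_before:.4f}  after={ece_after:.4f}")
```

Because the softmax input is divided by a single scalar, Temperature Scaling changes the confidences but not the predicted class, so accuracy is unaffected. A fitted temperature above 1 softens the probabilities, which is the direction needed to correct overconfidence.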
