CL AI Mar 15, 2023

On the Calibration and Uncertainty with Pólya-Gamma Augmentation for Dialog Retrieval Models

Tong Ye, Shijing Si, Jianzong Wang, Ning Cheng, Zhitao Li, Jing Xiao

arXiv:2303.08606v10.53 citations h-index: 22

Originality: Incremental advance

AI Analysis

This work addresses calibration and uncertainty estimation in dialog retrieval models, an incremental advance toward more reliable conversational AI systems.

The paper tackles unreliable predictions in dialog response retrieval models caused by poor calibration and uncertainty estimation. It presents PG-DRR, a framework that adds a Gaussian Process layer to a deterministic deep neural network and uses Pólya-Gamma augmentation, and it reports the lowest empirical calibration error (ECE) while maintaining retrieval metrics such as $R_{10}@1$ and MAP.
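For readers unfamiliar with the metric, the following is a minimal sketch of the standard binned ECE estimator for binary relevance scores. It is not taken from the paper: the function name `expected_calibration_error`, the choice of 10 equal-width bins, and the binary formulation (mean predicted probability compared with the observed fraction of relevant responses) are all assumptions made for illustration, and the authors' exact binning may differ.

```python
import numpy as np


def expected_calibration_error(confidences, labels, n_bins=10):
    """Binned ECE for binary relevance predictions (illustrative sketch).

    confidences: predicted probability that each response is relevant.
    labels: 1 if the response is actually relevant, else 0.
    n_bins: number of equal-width bins on [0, 1] (an assumed default).
    """
    confidences = np.asarray(confidences, dtype=float)
    labels = np.asarray(labels, dtype=float)

    # Interior bin edges; right-closed bins so a score of 1.0 lands in the last bin.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.clip(np.digitize(confidences, edges[1:-1], right=True), 0, n_bins - 1)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        # Gap between the observed relevance rate and the mean predicted probability.
        gap = abs(labels[mask].mean() - confidences[mask].mean())
        # Weight the gap by the fraction of samples that fall in this bin.
        ece += mask.mean() * gap
    return ece
```

A model that scores every candidate at 0.9 while only half of them are relevant has a gap of 0.4 in its single occupied bin, so the estimator returns roughly 0.4; a model whose scores match the observed relevance rate in every bin returns 0.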

Deep neural retrieval models have amply demonstrated their power, but estimating the reliability of their predictions remains challenging. Most dialog response retrieval models output a single score indicating how relevant a response is to a given question. However, the poor calibration of deep neural networks leaves this single score with considerable uncertainty, so unreliable predictions can misinform user decisions. To investigate these issues, we present PG-DRR, an efficient calibration and uncertainty estimation framework for dialog response retrieval models, which adds a Gaussian Process layer to a deterministic deep neural network and recovers conjugacy for tractable posterior inference via Pólya-Gamma augmentation. PG-DRR achieves the lowest empirical calibration error (ECE) on the in-domain datasets and the distributional shift task while maintaining $R_{10}@1$ and MAP performance.
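As background on why the augmentation recovers conjugacy, the block below sketches the standard Pólya-Gamma identity for a Bernoulli (logistic) likelihood under a zero-mean Gaussian Process prior. The notation ($f$ for the latent GP values, $K$ for the kernel matrix, $\omega$ for the auxiliary variables) is assumed here for illustration; the abstract does not give the paper's exact formulation, which may differ, for example by using a sparse or variational approximation.

```latex
% Pólya-Gamma integral identity, with \kappa = a - b/2 and \omega \sim \mathrm{PG}(b, 0):
\frac{(e^{\psi})^{a}}{(1 + e^{\psi})^{b}}
  = 2^{-b} e^{\kappa \psi} \int_{0}^{\infty} e^{-\omega \psi^{2}/2} \, p(\omega) \, d\omega .

% Bernoulli likelihood with logit f_i and label y_i \in \{0, 1\}: a = y_i, b = 1, \kappa_i = y_i - 1/2.
p(y_i \mid f_i) = \frac{e^{y_i f_i}}{1 + e^{f_i}},
\qquad
p(y_i \mid f_i, \omega_i) \propto \exp\!\left( \kappa_i f_i - \tfrac{1}{2} \omega_i f_i^{2} \right).

% Conditional on \omega, the likelihood is Gaussian in f, so a GP prior f \sim \mathcal{N}(0, K) is conjugate:
f \mid y, \omega \sim \mathcal{N}\!\left( \Sigma \kappa, \; \Sigma \right),
\qquad
\Sigma = \left( K^{-1} + \Omega \right)^{-1},
\quad
\Omega = \operatorname{diag}(\omega_1, \dots, \omega_n).

% The auxiliary variables have a closed-form conditional as well:
\omega_i \mid f_i \sim \mathrm{PG}(1, f_i).
```

Because both conditionals are available in closed form, posterior inference over $f$ stays tractable even though the logistic likelihood alone is not conjugate to a Gaussian prior.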
