CL · Aug 21, 2015

Posterior calibration and exploratory analysis for natural language processing models

arXiv:1508.05154v228.8154 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for better uncertainty quantification in NLP, particularly for users relying on model outputs in tasks like political event analysis, though it is incremental in building on existing calibration concepts.

The paper proposes a method for evaluating the calibration of probabilistic NLP models, that is, for checking whether their posterior probabilities correspond to empirical frequencies. It applies the method to show miscalibration in several commonly used models, and it also introduces a coreference sampling algorithm that creates confidence intervals for a political event extraction task.

Many models in natural language processing define probabilistic distributions over linguistic structures. We argue that (1) the quality of a model's posterior distribution can and should be directly evaluated, as to whether probabilities correspond to empirical frequencies, and (2) NLP uncertainty can be projected not only to pipeline components, but also to exploratory data analysis, telling a user when to trust and not trust the NLP analysis. We present a method to analyze calibration, and apply it to compare the miscalibration of several commonly used models. We also contribute a coreference sampling algorithm that can create confidence intervals for a political event extraction task.
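
To make "probabilities correspond to empirical frequencies" concrete, here is a minimal sketch of one common way to check calibration for binary predictions. It is not taken from the paper: the equal-width binning, the root-mean-square summary, and the names `calibration_bins` and `calibration_error` are all assumptions made for illustration, and the paper's own procedure may differ.

```python
from typing import List, Sequence, Tuple

# Each bin is (mean predicted probability, empirical frequency, count).
Bin = Tuple[float, float, int]


def calibration_bins(
    probs: Sequence[float],
    labels: Sequence[int],
    n_bins: int = 10,
) -> List[Bin]:
    """Group predictions into equal-width probability bins (an assumed scheme).

    probs  -- predicted probabilities of the positive class, each in [0, 1]
    labels -- gold labels, 1 for positive and 0 for negative
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    prob_sums = [0.0] * n_bins
    positives = [0] * n_bins
    counts = [0] * n_bins
    for p, y in zip(probs, labels):
        # A probability of exactly 1.0 is placed in the last bin.
        b = min(int(p * n_bins), n_bins - 1)
        prob_sums[b] += p
        positives[b] += y
        counts[b] += 1
    # Empty bins carry no information, so they are skipped.
    return [
        (prob_sums[b] / counts[b], positives[b] / counts[b], counts[b])
        for b in range(n_bins)
        if counts[b] > 0
    ]


def calibration_error(bins: Sequence[Bin]) -> float:
    """Count-weighted root-mean-square gap between prediction and frequency."""
    total = sum(count for _, _, count in bins)
    if total == 0:
        raise ValueError("no predictions to evaluate")
    squared_gap = sum(
        count * (mean_prob - freq) ** 2 for mean_prob, freq, count in bins
    )
    return (squared_gap / total) ** 0.5
```

For a well-calibrated model, each bin's mean predicted probability sits close to the bin's empirical frequency and the error is near zero. A model that predicts 0.8 for events that occur only half the time shows a visible gap in that bin.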

