LG, ML | Sep 20, 2019

Characterizing Sources of Uncertainty to Proxy Calibration and Disambiguate Annotator and Data Bias

Asma Ghandeharioun, Brian Eoff, Brendan Jou, Rosalind W. Picard

arXiv:1909.09285v28.120 citations | h-index: 106 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses interpretability and fairness issues in machine learning for subjective domains like emotion recognition, though it is incremental as it builds on existing uncertainty quantification methods.

The paper tackles model interpretability in tasks where annotators can legitimately disagree, such as emotion recognition, by quantifying uncertainty with a modified Monte Carlo dropout method. It finds that aleatoric uncertainty correlates with human annotator disagreement (r ≈ 0.3) and can identify difficult samples, while epistemic uncertainty can reveal data bias. The authors also report modest performance improvements from incorporating model uncertainty.

Supporting model interpretability for complex phenomena where annotators can legitimately disagree, such as emotion recognition, is a challenging machine learning task. In this work, we show that explicitly quantifying the uncertainty in such settings has interpretability benefits. We use a simple modification of a classical network inference using Monte Carlo dropout to give measures of epistemic and aleatoric uncertainty. We identify a significant correlation between aleatoric uncertainty and human annotator disagreement ($r \approx 0.3$). Additionally, we demonstrate how difficult and subjective training samples can be identified using aleatoric uncertainty and how epistemic uncertainty can reveal data bias that could result in unfair predictions. We identify the total uncertainty as a suitable surrogate for model calibration, i.e., the degree to which we can trust the model's predicted confidence. In addition to explainability benefits, we observe modest performance boosts from incorporating model uncertainty.
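The abstract does not spell out the exact modification to Monte Carlo dropout, so the following is only a minimal sketch of one common way to split predictive uncertainty into aleatoric and epistemic parts, not the authors' implementation. It assumes a classifier whose dropout layers stay active at inference, so that each input yields several softmax vectors. Total uncertainty is taken as the entropy of the averaged prediction, aleatoric uncertainty as the average entropy of the individual passes, and epistemic uncertainty as their difference. The function names and the toy inputs are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def decompose_uncertainty(mc_probs):
    """Split predictive uncertainty for one input.

    mc_probs: list of T probability vectors, one per stochastic forward
    pass (assumed to come from a network with dropout left on at inference).
    Returns (total, aleatoric, epistemic).
    """
    num_passes = len(mc_probs)
    num_classes = len(mc_probs[0])
    # Average the class probabilities over the stochastic passes.
    mean_p = [sum(p[k] for p in mc_probs) / num_passes for k in range(num_classes)]
    total = entropy(mean_p)                                   # entropy of the mean
    aleatoric = sum(entropy(p) for p in mc_probs) / num_passes  # mean of entropies
    epistemic = total - aleatoric                             # disagreement between passes
    return total, aleatoric, epistemic


# Hypothetical toy inputs (not data from the paper).
# Every pass gives the same ambiguous prediction: uncertainty is all aleatoric.
ambiguous_sample = [[0.5, 0.5]] * 4
# Each pass is confident, but the passes disagree: uncertainty is mostly epistemic.
contested_sample = [[0.99, 0.01], [0.01, 0.99], [0.99, 0.01], [0.01, 0.99]]

amb_total, amb_aleatoric, amb_epistemic = decompose_uncertainty(ambiguous_sample)
con_total, con_aleatoric, con_epistemic = decompose_uncertainty(contested_sample)
```

Under this decomposition, a sample that is inherently ambiguous (the kind annotators might disagree on) scores high on the aleatoric term, while a sample the model has seen too little of, for example from an under-represented group, scores high on the epistemic term.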

