CL · Oct 20, 2020

Natural Language Inference with Mixed Effects

arXiv:2010.10501v1992 citations
Originality: Incremental advance
AI Analysis

This paper addresses annotation quality issues in NLP datasets, which matter to both researchers and practitioners, but the advance is incremental because it builds on existing mixed effects models.

The authors tackle annotator disagreement in natural language inference datasets by proposing a method that trains directly on raw annotations, avoiding the problems introduced by aggregation, and show that it improves performance over models without annotator random effects.

There is growing evidence that the prevalence of disagreement in the raw annotations used to construct natural language inference datasets makes the common practice of aggregating those annotations to a single label problematic. We propose a generic method that allows one to skip the aggregation step and train on the raw annotations directly without subjecting the model to unwanted noise that can arise from annotator response biases. We demonstrate that this method, which generalizes the notion of a mixed effects model by incorporating annotator random effects into any existing neural model, improves performance over models that do not incorporate such effects.
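The idea of annotator random effects in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the simplest possible form, a per-annotator random intercept added to the class logits, with a zero-mean Gaussian penalty on those intercepts. The function names (`predict`, `penalized_nll`), the `prior_var` parameter, and the three-class label encoding are hypothetical choices made for illustration only.

```python
import math

# Assumed label encoding: 0 = entailment, 1 = neutral, 2 = contradiction.

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(fixed_logits, annotator_offsets=None):
    """Class probabilities from fixed-effect logits (the neural model's
    output) plus an optional per-annotator random intercept."""
    if annotator_offsets is None:
        # Unseen annotator: fall back to the fixed effects alone.
        annotator_offsets = [0.0] * len(fixed_logits)
    return softmax([f + u for f, u in zip(fixed_logits, annotator_offsets)])

def penalized_nll(annotations, logits_by_item, offsets_by_annotator, prior_var=1.0):
    """Negative log-likelihood of raw (item, annotator, label) triples,
    plus a zero-mean Gaussian penalty on the random effects."""
    nll = 0.0
    for item, annotator, label in annotations:
        probs = predict(logits_by_item[item], offsets_by_annotator[annotator])
        nll -= math.log(probs[label])
    penalty = sum(
        u * u for offsets in offsets_by_annotator.values() for u in offsets
    ) / (2.0 * prior_var)
    return nll + penalty

# Two annotators label the same item differently; no aggregation is needed.
logits_by_item = {"pair-1": [1.0, 0.5, -1.0]}
offsets_by_annotator = {
    "ann-a": [0.0, 0.0, 0.0],   # no response bias
    "ann-b": [0.0, 1.0, 0.0],   # tends to answer "neutral"
}
annotations = [("pair-1", "ann-a", 0), ("pair-1", "ann-b", 1)]
loss = penalized_nll(annotations, logits_by_item, offsets_by_annotator)
```

In this sketch the annotator offsets absorb individual response biases during training, while prediction for an unseen annotator uses the fixed-effect logits alone. In a real neural model both the network weights and the offsets would be fit jointly by gradient descent on this kind of penalized objective.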

Code Implementations: 1 repo
