CL · Nov 19, 2024

JuniperLiu at CoMeDi Shared Task: Models as Annotators in Lexical Semantics Disagreements

arXiv:2411.12147v219 citations · h-index: 2 · Has Code · COLING Workshops
Originality: Incremental advance
AI Analysis

This work addresses annotator disagreement in lexical semantics, a topic of interest to NLP researchers. It is incremental in that it builds on existing methods for a shared task.

The paper predicts majority votes and annotator disagreements in lexical semantics by combining model ensembles with MLP-based and threshold-based methods. The approach is effective, especially for disagreement prediction, and the authors find that the standard deviation of continuous scores correlates with human disagreement.

We present the results of our system for the CoMeDi Shared Task, which predicts majority votes (Subtask 1) and annotator disagreements (Subtask 2). Our approach combines model ensemble strategies with MLP-based and threshold-based methods trained on pretrained language models. Treating individual models as virtual annotators, we simulate the annotation process by designing aggregation measures that incorporate continuous relatedness scores and discrete classification labels to capture both majority and disagreement. Additionally, we employ anisotropy removal techniques to enhance performance. Experimental results demonstrate the effectiveness of our methods, particularly for Subtask 2. Notably, we find that the standard deviation of continuous relatedness scores across different model manipulations correlates more closely with human disagreement annotations than metrics computed on aggregated discrete labels. The code will be published at https://github.com/RyanLiut/CoMeDi_Solution.
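The "models as virtual annotators" idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the threshold values, the four-label scale, the tie-breaking rule, and the function names `to_label`, `majority_label`, and `disagreement` are all assumptions chosen for illustration. Each model contributes one continuous relatedness score for a usage pair; thresholds turn scores into discrete labels for a majority vote, and the standard deviation of the raw scores serves as a disagreement estimate.

```python
from bisect import bisect_right
from collections import Counter
from statistics import pstdev

# Hypothetical cut points that map a score in [0, 1] to labels 1..4.
THRESHOLDS = [0.25, 0.5, 0.75]


def to_label(score, thresholds=THRESHOLDS):
    """Map one continuous relatedness score to a discrete label."""
    return bisect_right(thresholds, score) + 1


def majority_label(scores, thresholds=THRESHOLDS):
    """Majority vote over per-model labels; ties go to the lowest label (assumption)."""
    counts = Counter(to_label(s, thresholds) for s in scores)
    best = max(counts.values())
    return min(label for label, count in counts.items() if count == best)


def disagreement(scores):
    """Population standard deviation of the per-model continuous scores."""
    return pstdev(scores)


# Made-up scores from four virtual annotators for two usage pairs.
scores_agree = [0.80, 0.82, 0.79, 0.81]
scores_split = [0.10, 0.90, 0.30, 0.85]

print(majority_label(scores_agree), disagreement(scores_agree))
print(majority_label(scores_split), disagreement(scores_split))
```

In this toy input both pairs receive the same majority label, but the second pair has a much larger score spread, which is the kind of signal the abstract reports as tracking human disagreement.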

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
