JuniperLiu at CoMeDi Shared Task: Models as Annotators in Lexical Semantics Disagreements
This work addresses annotator disagreement in lexical semantics, a topic of interest to NLP researchers, but its contribution is incremental: it applies existing methods to a shared task.
The paper tackles the prediction of majority votes and annotator disagreement in lexical semantics by combining model ensembles with MLP-based and threshold-based methods. The approach is effective, especially for disagreement prediction, and the authors find that the standard deviation of continuous scores correlates with human disagreement.
We present the results of our system for the CoMeDi Shared Task, which predicts majority votes (Subtask 1) and annotator disagreements (Subtask 2). Our approach combines model ensemble strategies with MLP-based and threshold-based methods trained on pretrained language models. Treating individual models as virtual annotators, we simulate the annotation process by designing aggregation measures that incorporate continuous relatedness scores and discrete classification labels to capture both majority and disagreement. Additionally, we employ anisotropy removal techniques to enhance performance. Experimental results demonstrate the effectiveness of our methods, particularly for Subtask 2. Notably, we find that the standard deviation of continuous relatedness scores across different model manipulations correlates more strongly with human disagreement annotations than metrics computed on aggregated discrete labels. The code will be published at https://github.com/RyanLiut/CoMeDi_Solution.
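The "models as virtual annotators" idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each model outputs one continuous relatedness score in [0, 1] for a usage pair, and that scores are mapped onto an ordinal label scale with fixed thresholds. The threshold values, the four-point scale, the tie-breaking rule, and the names `to_label` and `aggregate` are all hypothetical choices made for illustration.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical thresholds (an assumption, not taken from the paper) that map
# a continuous relatedness score in [0, 1] onto an assumed four-point scale.
THRESHOLDS = (0.25, 0.5, 0.75)


def to_label(score, thresholds=THRESHOLDS):
    """Map a continuous score to a discrete label in 1..len(thresholds)+1."""
    return 1 + sum(score >= t for t in thresholds)


def aggregate(scores, thresholds=THRESHOLDS):
    """Treat each model's score as one virtual annotator's judgment."""
    labels = [to_label(s, thresholds) for s in scores]
    counts = Counter(labels)
    top = max(counts.values())
    # Ties are broken toward the smaller label (arbitrary, for this sketch).
    majority = min(label for label, c in counts.items() if c == top)
    return {
        "labels": labels,
        "majority": majority,             # Subtask 1 style prediction
        "mean_score": mean(scores),
        "disagreement": pstdev(scores),   # Subtask 2 style prediction
    }


if __name__ == "__main__":
    # Three hypothetical models scoring the same usage pair.
    print(aggregate([0.10, 0.20, 0.90]))
    print(aggregate([0.60, 0.60, 0.60]))
```

The disagreement estimate is computed on the continuous scores rather than on the thresholded labels, in line with the abstract's observation that the former tracks human disagreement more closely. A population standard deviation is used here only for simplicity.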