CL · Aug 5, 2024

To Aggregate or Not to Aggregate. That is the Question: A Case Study on Annotation Subjectivity in Span Prediction

arXiv:2408.02257v1 · 26 citations · h-index: 36
Originality: Synthesis-oriented
AI Analysis

This paper addresses annotation subjectivity in legal text span prediction, but its contribution is incremental: it focuses on a specific aggregation method for an existing task.

The paper tackles the problem of predicting text spans that support legal area labels in layperson-written legal problem descriptions, where annotation subjectivity arises from differing lawyer opinions. Experiments found that training on majority-voted spans outperformed training on disaggregated ones, with a 5% F1 score improvement.

This paper explores the task of automatic prediction of text spans in a legal problem description that support a legal area label. We use a corpus of problem descriptions written by laypeople in English that is annotated by practising lawyers. Inherent subjectivity exists in our task because legal area categorisation is a complex task, and lawyers often have different views on a problem, especially in the face of legally-imprecise descriptions of issues. Experiments show that training on majority-voted spans outperforms training on disaggregated ones.
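
To make the contrast between majority-voted and disaggregated spans concrete, here is a minimal sketch of one way majority voting over span annotations could be done. It is not the paper's implementation: the function names `majority_vote_spans` and `disaggregate`, the token-level voting, the half-open `(start, end)` span format, and the strict-majority threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical span format: half-open token range [start, end).
Span = Tuple[int, int]


def majority_vote_spans(annotations: List[List[Span]], n_tokens: int) -> List[Span]:
    """Keep tokens marked by more than half of the annotators, then merge them into spans.

    `annotations` holds one list of spans per annotator (assumed layout).
    """
    votes = [0] * n_tokens
    for spans in annotations:
        # Collect tokens in a set so overlapping spans from one annotator count once.
        marked = set()
        for start, end in spans:
            marked.update(range(max(start, 0), min(end, n_tokens)))
        for i in marked:
            votes[i] += 1

    # Strict majority: a token needs votes from more than half of the annotators.
    threshold = len(annotations) / 2
    keep = [v > threshold for v in votes]

    # Merge consecutive kept tokens back into half-open spans.
    merged: List[Span] = []
    start = None
    for i, kept in enumerate(keep):
        if kept and start is None:
            start = i
        elif not kept and start is not None:
            merged.append((start, i))
            start = None
    if start is not None:
        merged.append((start, n_tokens))
    return merged


def disaggregate(annotations: List[List[Span]]) -> List[List[Span]]:
    """Disaggregated alternative: one training target per annotator, unchanged apart from sorting."""
    return [sorted(spans) for spans in annotations]


# Three hypothetical annotators marking a 10-token description.
example = [[(0, 3)], [(1, 4)], [(2, 5), (7, 9)]]
aggregated = majority_vote_spans(example, n_tokens=10)
per_annotator = disaggregate(example)
```

In this sketch the aggregated setup yields a single target per description, while the disaggregated setup yields one target per annotator, so a model trained on the latter sees every individual lawyer's view, including minority spans such as `(7, 9)` above.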

Code Implementations: 1 repo
