CL | Jun 16, 2025

Are manual annotations necessary for statutory interpretations retrieval?

arXiv:2506.13965v1 | 1 citation | h-index: 8 | ICAIL
Originality: Incremental advance
AI Analysis

This work addresses the costly, repetitive manual annotation that statutory interpretation retrieval systems require, a bottleneck for legal professionals and researchers.

This paper investigates whether manual annotations are necessary for retrieving statutory interpretations by determining the optimal number of annotations per legal concept, evaluating annotation selection strategies, and testing LLM-based automation. The results show that performance gains plateau with minimal annotations, targeted annotation selection improves model performance, and LLM automation can reduce manual effort while maintaining effectiveness.

One element of legal research is looking for cases where judges have extended the meaning of a legal concept by providing interpretations of what the concept means or does not mean. This allows legal professionals to use such interpretations as precedents, and laypeople to better understand the legal concept. The state-of-the-art approach for retrieving the most relevant interpretations of these concepts currently depends on ranking sentences and training language models on annotated examples. This manual annotation process can be quite expensive and needs to be repeated for each such concept, which has prompted recent research into automating it. In this paper, we highlight the results of various experiments conducted to determine the volume, scope, and even the need for manual annotation. First, we determine the optimal number of annotations per legal concept. Second, we check whether sentences for annotation can be drawn randomly or whether model performance improves when only the best candidates are annotated. Finally, we examine the outcome of automating the annotation process with the help of an LLM.

