CL LG | Oct 15, 2024

LegalLens Shared Task 2024: Legal Violation Identification in Unstructured Text

Ben Hagag, Liav Harpaz, Gil Semo, Dor Bernsohn, Rohit Saha, Pashootan Vaezipoor, Kyryl Truskovskyi, Gerasimos Spanakis

arXiv:2410.12064v1 | 13.822 citations | h-index: 20 | NLLP

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of automating legal violation detection for legal professionals and researchers. It is incremental, however: it builds on existing methods and reports modest performance gains.

The paper tackles the problem of identifying legal violations in unstructured text through a shared task. The top teams, which fine-tuned pre-trained language models, improved on the baselines by 7.11% in named entity recognition and 5.7% in natural language inference.

This paper presents the results of the LegalLens Shared Task, focusing on detecting legal violations within text in the wild across two sub-tasks: LegalLens-NER for identifying legal violation entities and LegalLens-NLI for associating these violations with relevant legal contexts and affected individuals. Using an enhanced LegalLens dataset covering labor, privacy, and consumer protection domains, 38 teams participated in the task. Our analysis reveals that while a mix of approaches was used, the top-performing teams in both tasks consistently relied on fine-tuning pre-trained language models, outperforming legal-specific models and few-shot methods. The top-performing team achieved a 7.11% improvement in NER over the baseline, while NLI saw a more marginal improvement of 5.7%. Despite these gains, the complexity of legal texts leaves room for further advancements.
