CL · Apr 12, 2025

Improving the Accuracy and Efficiency of Legal Document Tagging with Large Language Models and Instruction Prompts

arXiv:2504.09309v1 · 2 citations · h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses accurate and efficient tagging of legal documents, which is crucial for organizing and accessing large volumes of legal documentation. The contribution appears incremental, since it applies existing LLM capabilities to a specific domain.

The paper tackles legal multi-label classification by proposing Legal-LLM, a method that fine-tunes Large Language Models with instruction prompts to output legal categories directly. It reports improved micro-F1 and macro-F1 scores on two benchmark datasets, POSTURE50K and EURLEX57K.

Legal multi-label classification is a critical task for organizing and accessing the vast amount of legal documentation. Despite its importance, it faces challenges such as the complexity of legal language, intricate label dependencies, and significant label imbalance. In this paper, we propose Legal-LLM, a novel approach that leverages the instruction-following capabilities of Large Language Models (LLMs) through fine-tuning. We reframe the multi-label classification task as a structured generation problem, instructing the LLM to directly output the relevant legal categories for a given document. We evaluate our method on two benchmark datasets, POSTURE50K and EURLEX57K, using micro-F1 and macro-F1 scores. Our experimental results demonstrate that Legal-LLM outperforms a range of strong baseline models, including traditional methods and other Transformer-based approaches. Furthermore, ablation studies and human evaluations validate the effectiveness of our approach, particularly in handling label imbalance and generating relevant and accurate legal labels.
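
To make the abstract's framing concrete, the following is a minimal sketch of treating multi-label tagging as structured generation and scoring it with micro-F1 and macro-F1. It is not the authors' implementation: the prompt wording, the semicolon-separated output format, the label names, and the helper functions `build_prompt`, `parse_labels`, and `micro_macro_f1` are all assumptions made for illustration, and the model call itself is left out.

```python
# Hypothetical label set for illustration only; these are not the actual
# POSTURE50K or EURLEX57K labels.
LABELS = ["Appellate Review", "Motion to Dismiss", "Summary Judgment"]


def build_prompt(document: str, labels: list[str]) -> str:
    """Format a document as an instruction asking for all matching labels.

    The wording and the semicolon-separated answer format are assumptions.
    """
    options = "; ".join(labels)
    return (
        "Instruction: List every legal category that applies to the document, "
        "separated by semicolons.\n"
        f"Categories: {options}\n"
        f"Document: {document}\n"
        "Answer:"
    )


def parse_labels(generated: str, labels: list[str]) -> set[str]:
    """Map generated text back to known labels, dropping unrecognized items."""
    known = {label.lower(): label for label in labels}
    parts = (part.strip().lower() for part in generated.split(";"))
    return {known[part] for part in parts if part in known}


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from counts; returns 0.0 when the label never occurs (an assumption)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(
    gold: list[set[str]], pred: list[set[str]], labels: list[str]
) -> tuple[float, float]:
    """Micro-F1 pools counts over all labels; macro-F1 averages per-label F1."""
    per_label = []
    total_tp = total_fp = total_fn = 0
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        per_label.append(f1(tp, fp, fn))
        total_tp, total_fp, total_fn = total_tp + tp, total_fp + fp, total_fn + fn
    micro = f1(total_tp, total_fp, total_fn)
    macro = sum(per_label) / len(per_label)
    return micro, macro


# Stand-ins for text a fine-tuned model might generate for two documents.
gold = [{"Appellate Review", "Motion to Dismiss"}, {"Summary Judgment"}]
pred = [
    parse_labels("appellate review; Unknown Tag", LABELS),
    parse_labels("Summary Judgment; Motion to Dismiss", LABELS),
]
micro, macro = micro_macro_f1(gold, pred, LABELS)
```

Because macro-F1 weights every label equally regardless of frequency, it is the more sensitive of the two scores to the label imbalance the abstract mentions, while micro-F1 is dominated by frequent labels.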

