CL, AI, IR, LG | Feb 9, 2025

LegalSeg: Unlocking the Structure of Indian Legal Judgments Through Rhetorical Role Classification

arXiv:2502.05836v1 | 16 citations | h-index: 10 | NAACL
Originality: Incremental advance
AI Analysis

This work addresses rhetorical role classification in Indian legal judgments, a task relevant to legal professionals and NLP researchers, and presents an incremental step towards improving legal document understanding.

The authors tackle semantic segmentation of Indian legal judgments through rhetorical role classification, introducing the LegalSeg dataset and benchmarking several models on it. Their results show that models incorporating broader context outperform those relying solely on sentence-level features.

In this paper, we address the task of semantic segmentation of legal documents through rhetorical role classification, with a focus on Indian legal judgments. We introduce LegalSeg, the largest annotated dataset for this task, comprising over 7,000 documents and 1.4 million sentences, labeled with 7 rhetorical roles. To benchmark performance, we evaluate multiple state-of-the-art models, including Hierarchical BiLSTM-CRF, TransformerOverInLegalBERT (ToInLegalBERT), Graph Neural Networks (GNNs), and Role-Aware Transformers, alongside an exploratory RhetoricLLaMA, an instruction-tuned large language model. Our results demonstrate that models incorporating broader context, structural relationships, and sequential sentence information outperform those relying solely on sentence-level features. Additionally, we conducted experiments using surrounding context and predicted or actual labels of neighboring sentences to assess their impact on classification accuracy. Despite these advancements, challenges persist in distinguishing between closely related roles and addressing class imbalance. Our work underscores the potential of advanced techniques for improving legal document understanding and sets a strong foundation for future research in legal NLP.
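
The idea of classifying a sentence together with its surrounding context, and optionally with the labels of neighboring sentences, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_context_inputs`, the `[SEP]` separator, the `<target>` marker, and the placeholder role names are all assumptions made for illustration, and the abstract does not specify the actual input format.

```python
from typing import List, Optional

# Hypothetical separator; the paper's real input format is not stated in the abstract.
SEP = " [SEP] "


def build_context_inputs(
    sentences: List[str],
    window: int = 1,
    labels: Optional[List[str]] = None,
) -> List[str]:
    """Build one classifier input per sentence from a window of neighbors.

    Each target sentence is marked with "<target>". If `labels` is given
    (predicted or gold roles of the neighbors), every neighbor is prefixed
    with its role tag; the target's own label is never included.
    """
    if labels is not None and len(labels) != len(sentences):
        raise ValueError("labels must align one-to-one with sentences")

    inputs = []
    for i, sentence in enumerate(sentences):
        start = max(0, i - window)
        end = min(len(sentences), i + window + 1)
        parts = []
        for j in range(start, end):
            if j == i:
                parts.append("<target> " + sentence)
            elif labels is not None:
                parts.append("<" + labels[j] + "> " + sentences[j])
            else:
                parts.append(sentences[j])
        inputs.append(SEP.join(parts))
    return inputs


# Toy document with placeholder role names (not the dataset's actual roles).
doc = ["A", "B", "C"]
roles = ["ROLE_X", "ROLE_Y", "ROLE_Z"]
context_only = build_context_inputs(doc, window=1)
with_neighbor_labels = build_context_inputs(doc, window=1, labels=roles)
```

Setting `window=0` reduces each input to the target sentence alone, which corresponds to the sentence-level baseline that the context-aware models are compared against.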

