CLIR, Oct 14, 2022

Legal Case Document Summarization: Extractive and Abstractive Methods and their Evaluation

arXiv:2210.07544v1 | 307 citations | h-index: 32
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of summarizing lengthy legal documents for legal professionals, but it is incremental as it applies existing summarization methods to a new domain.

The paper tackles the problem of summarizing long legal case documents by comparing extractive and abstractive methods. It finds that extractive approaches often outperform abstractive ones because of input token length constraints, and evaluations by law practitioners provide key insights.

Summarization of legal case judgement documents is a challenging problem in Legal NLP. However, few analyses exist of how different families of summarization models (e.g., extractive vs. abstractive) perform when applied to legal case documents. This question is particularly important since many recent transformer-based abstractive summarization models have restrictions on the number of input tokens, and legal documents are known to be very long. How best to evaluate legal case document summarization systems is also an open question. In this paper, we carry out extensive experiments with several extractive and abstractive summarization methods (both supervised and unsupervised) over three legal summarization datasets that we have developed. Our analyses, which include evaluation by law practitioners, lead to several interesting insights on legal summarization in particular and long document summarization in general.
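
The input-token restriction mentioned in the abstract can be made concrete with a small sketch. One common workaround for a model with a fixed input budget is to split a long document into consecutive chunks that each fit the budget and summarize them separately. The code below is only an illustration under stated assumptions, not the authors' method: `count_tokens` and `chunk_sentences` are hypothetical helper names, whitespace-separated words stand in for a real model's subword tokens, and the document is assumed to be already split into sentences.

```python
from typing import List


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate model tokens.
    # A real system would use the summarization model's own tokenizer.
    return len(text.split())


def chunk_sentences(sentences: List[str], max_tokens: int) -> List[str]:
    """Greedily pack consecutive sentences into chunks of at most max_tokens.

    A single sentence longer than max_tokens is kept whole in its own chunk;
    a real pipeline would have to truncate or split it further.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")

    chunks: List[str] = []
    current: List[str] = []
    used = 0
    for sentence in sentences:
        n = count_tokens(sentence)
        # Close the current chunk if adding this sentence would exceed the budget.
        if current and used + n > max_tokens:
            chunks.append(" ".join(current))
            current, used = [], 0
        current.append(sentence)
        used += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk could then be passed to an abstractive model and the partial summaries concatenated. Extractive methods, which select sentences from the full text, do not face this input limit in the same way.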

Code Implementations: 1 repo