CLIR · Jan 21

Supporting Humans in Evaluating AI Summaries of Legal Depositions

arXiv:2601.15182v1 · h-index: 6 · CHIIR
Originality: Incremental advance
AI Analysis

This work addresses the need for accurate summary evaluation in the legal domain, but the advance is incremental because it builds on existing nugget-based methods.

The paper tackles the problem of evaluating AI-generated summaries of legal depositions by adapting nugget-based methods to assist end users. The result is a prototype that supports legal professionals in comparing two summaries and in improving automatically generated ones.

While large language models (LLMs) are increasingly used to summarize long documents, this trend poses significant challenges in the legal domain, where the factual accuracy of deposition summaries is crucial. Nugget-based methods have been shown to be extremely helpful for the automated evaluation of summarization approaches. In this work, we translate these methods to the user side and explore how nuggets could directly assist end users. Although prior systems have demonstrated the promise of nugget-based evaluation, its potential to support end users remains underexplored. Focusing on the legal domain, we present a prototype that leverages a factual nugget-based approach to support legal professionals in two concrete scenarios: (1) determining which of two summaries is better, and (2) manually improving an automatically generated summary.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
