AI | IR | Mar 27, 2024

Leveraging Large Language Models for Relevance Judgments in Legal Case Retrieval

Shengjie Ma, Qi Chu, Jiaxin Mao, Xuhui Jiang, Haozhe Duan, Chong Chen

arXiv:2403.18405v3 | 17.221 citations | h-index: 6

Originality: Incremental advance

AI Analysis

The paper addresses the time-consuming, expertise-intensive task of legal case retrieval for legal professionals, though its contribution is incremental in that it applies LLMs to a specific domain.

The paper tackles the problem of making reliable and interpretable relevance judgments in legal case retrieval. It proposes a few-shot approach using large language models (LLMs) that decomposes the judgment process to mimic human annotators and incorporates expert reasoning. The results show that the approach yields reliable assessments comparable to those of human experts and enables knowledge transfer to smaller models via annotation-based distillation.

Determining which legal cases are relevant to a given query requires reading lengthy texts and applying nuanced legal reasoning. Traditionally, this task has demanded significant time and domain expertise to identify key legal facts and reach sound juridical conclusions. In addition, existing datasets of legal case similarity often lack interpretability, making it difficult to understand the rationale behind relevance judgments. With the growing capabilities of large language models (LLMs), researchers have begun investigating their potential in this domain. Nonetheless, how to use a general-purpose LLM to produce reliable relevance judgments in legal case retrieval remains largely unexplored. To address this gap, we propose a novel few-shot approach in which LLMs help generate expert-aligned, interpretable relevance judgments. The approach decomposes the judgment process into several stages, mimicking the workflow of human annotators and allowing expert reasoning to be incorporated flexibly to improve the accuracy of relevance judgments. It also makes the data labeling interpretable, so the basis for each relevance assessment is transparent. By comparing relevance judgments made by LLMs with those made by human experts, we empirically demonstrate that the proposed approach yields reliable and valid relevance assessments. Furthermore, we demonstrate that, with minimal expert supervision, the approach enables a large language model to acquire case analysis expertise and then transfer this ability to a smaller model via annotation-based knowledge distillation.
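The staged workflow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the stages, prompts, or output format, so everything here is an assumption made for illustration: the two stages (fact extraction, then a few-shot judgment with expert reasoning), the `Demo` and `Judgment` structures, the prompt wording, and the "Label:" output convention are all hypothetical. The LLM is passed in as a plain callable rather than tied to any real API.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interface: any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]


@dataclass
class Demo:
    """One expert-written demonstration (hypothetical structure)."""
    query_facts: str
    candidate_facts: str
    reasoning: str
    label: str


@dataclass
class Judgment:
    """A relevance label plus the intermediate outputs that make it interpretable."""
    label: str
    rationale: str
    query_facts: str
    candidate_facts: str


def extract_facts(llm: LLM, case_text: str) -> str:
    """Stage 1 (assumed): condense a lengthy case into its key legal facts."""
    return llm(f"Summarize the key legal facts of this case:\n{case_text}").strip()


def build_judgment_prompt(demos: List[Demo], query_facts: str, candidate_facts: str) -> str:
    """Stage 2 prompt (assumed): few-shot examples carrying expert reasoning."""
    parts = ["Decide whether the candidate case is relevant to the query case."]
    for d in demos:
        parts.append(
            f"Query facts: {d.query_facts}\nCandidate facts: {d.candidate_facts}\n"
            f"Reasoning: {d.reasoning}\nLabel: {d.label}"
        )
    parts.append(
        f"Query facts: {query_facts}\nCandidate facts: {candidate_facts}\nReasoning:"
    )
    return "\n\n".join(parts)


def judge_relevance(llm: LLM, query_case: str, candidate_case: str, demos: List[Demo]) -> Judgment:
    """Run both stages and keep the rationale alongside the label."""
    query_facts = extract_facts(llm, query_case)
    candidate_facts = extract_facts(llm, candidate_case)
    raw = llm(build_judgment_prompt(demos, query_facts, candidate_facts))
    # Assumed output format: free-text reasoning, then a final "Label: <value>" line.
    # If "Label:" is missing, the whole output becomes the label and the rationale is empty.
    rationale, _, label = raw.rpartition("Label:")
    return Judgment(label.strip(), rationale.strip(), query_facts, candidate_facts)
```

Because each `Judgment` keeps the extracted facts and the rationale, a collection of such records could serve as annotated training data for a smaller model, which is one plausible reading of the annotation-based distillation mentioned above.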
