AI · Sep 13, 2022

LegalBench: Prototyping a Collaborative Benchmark for Legal Reasoning

Neel Guha, Daniel E. Ho, Julian Nyarko, Christopher Ré

arXiv:2209.06120v1 · 19.423 citations · h-index: 23 · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for standardized evaluation in legal AI, but it is incremental as it prototypes a benchmark rather than solving the reasoning problem directly.

The authors address the problem of evaluating foundation models on legal reasoning tasks by proposing LegalBench, a collaborative benchmark organized around the IRAC framework from legal scholarship. They present an initial seed set of 44 tasks to guide future development.

Can foundation models be guided to execute tasks involving legal reasoning? We believe that building a benchmark to answer this question will require sustained collaborative efforts between the computer science and legal communities. To that end, this short paper serves three purposes. First, we describe how IRAC, a framework legal scholars use to distinguish different types of legal reasoning, can guide the construction of a Foundation Model oriented benchmark. Second, we present a seed set of 44 tasks built according to this framework. We discuss initial findings, and highlight directions for new tasks. Finally, inspired by the Open Science movement, we make a call for the legal and computer science communities to join our efforts by contributing new tasks. This work is ongoing, and our progress can be tracked here: https://github.com/HazyResearch/legalbench.
