CL | AI | Feb 13, 2023

Can GPT-3 Perform Statutory Reasoning?

arXiv:2302.06100v2 | 140 citations | h-index: 60
AI Analysis

This work assesses a foundational AI model's capability in legal reasoning, highlighting limitations for legal applications.

The paper evaluates GPT-3 (text-davinci-003) on statutory reasoning using the SARA dataset. It achieves better results than the previous best published results, but it also makes several types of clear errors, and it performs poorly on straightforward questions about simple synthetic statutes that it cannot have seen during training.

Statutory reasoning is the task of reasoning with facts and statutes, which are rules written in natural language by a legislature. It is a basic legal skill. In this paper we explore the capabilities of the most capable GPT-3 model, text-davinci-003, on an established statutory-reasoning dataset called SARA. We consider a variety of approaches, including dynamic few-shot prompting, chain-of-thought prompting, and zero-shot prompting. While we achieve results with GPT-3 that are better than the previous best published results, we also identify several types of clear errors it makes. We investigate why these errors happen. We discover that GPT-3 has imperfect prior knowledge of the actual U.S. statutes on which SARA is based. More importantly, we create simple synthetic statutes, which GPT-3 is guaranteed not to have seen during training. We find GPT-3 performs poorly at answering straightforward questions about these simple synthetic statutes.
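
The three prompting approaches named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's code: the `Case` record, the word-overlap similarity used to pick few-shot examples, the prompt layout, and the chain-of-thought cue are all assumptions made for illustration, and the statute text in the usage note is invented. No model is called; the sketch only builds prompt strings.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical labeled example: facts, a question, and a gold answer."""
    facts: str
    question: str
    answer: str


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word sets (an assumed stand-in metric)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def select_examples(pool: list[Case], facts: str, k: int) -> list[Case]:
    """Dynamic few-shot selection: pick the k pool cases most similar to the test facts."""
    ranked = sorted(pool, key=lambda c: token_overlap(c.facts, facts), reverse=True)
    return ranked[:k]


def build_prompt(statute: str, facts: str, question: str,
                 examples: tuple[Case, ...] = (),
                 chain_of_thought: bool = False) -> str:
    """Assemble a prompt; no examples gives zero-shot, examples give few-shot."""
    parts = [f"Statute:\n{statute}"]
    for ex in examples:
        parts.append(f"Facts: {ex.facts}\nQuestion: {ex.question}\nAnswer: {ex.answer}")
    # Assumed chain-of-thought cue appended after the final "Answer:" slot.
    cue = "Let's think step by step." if chain_of_thought else ""
    parts.append(f"Facts: {facts}\nQuestion: {question}\nAnswer: {cue}".rstrip())
    return "\n\n".join(parts)
```

For example, with an invented one-line statute such as `"Section 1. Any vehicle with more than four wheels is a truck."`, calling `build_prompt(statute, facts, question)` yields a zero-shot prompt, passing `chain_of_thought=True` appends the reasoning cue, and passing `examples=tuple(select_examples(pool, facts, 2))` yields a few-shot prompt whose examples change with each test case.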

Code Implementations: 1 repo