CL · AI · Feb 13, 2023

Can GPT-3 Perform Statutory Reasoning?

Andrew Blair-Stanek, Nils Holzenberger, Benjamin Van Durme

arXiv:2302.06100v2 · 16.6145 citations · h-index: 60 · Has Code

Originality: Incremental advance

AI Analysis

This work assesses a foundational AI model's capability in legal reasoning, highlighting limitations for legal applications.

The paper evaluates GPT-3's ability to perform statutory reasoning on the SARA dataset. GPT-3 outperforms the previous best published results but makes several types of clear errors, and it performs poorly even on straightforward questions about simple synthetic statutes it cannot have seen during training.

Statutory reasoning is the task of reasoning with facts and statutes, which are rules written in natural language by a legislature. It is a basic legal skill. In this paper we explore the capabilities of the most capable GPT-3 model, text-davinci-003, on an established statutory-reasoning dataset called SARA. We consider a variety of approaches, including dynamic few-shot prompting, chain-of-thought prompting, and zero-shot prompting. While we achieve results with GPT-3 that are better than the previous best published results, we also identify several types of clear errors it makes. We investigate why these errors happen. We discover that GPT-3 has imperfect prior knowledge of the actual U.S. statutes on which SARA is based. More importantly, we create simple synthetic statutes, which GPT-3 is guaranteed not to have seen during training. We find GPT-3 performs poorly at answering straightforward questions about these simple synthetic statutes.
