CL AI Dec 6, 2021

JUSTICE: A Benchmark Dataset for Supreme Court's Judgment Prediction

Mohammad Alali, Shaayan Syed, Mohammed Alsayed, Smit Patel, Hemanth Bodala

arXiv:2112.03414v12.414 citations Has Code

Originality Synthesis-oriented

AI Analysis

This paper contributes a high-quality dataset for NLP research in the legal domain, addressing a data scarcity problem for researchers and practitioners. The contribution is incremental in that it focuses on dataset creation rather than novel methods.

The authors tackled the lack of well-annotated datasets for Supreme Court of the United States (SCOTUS) cases by creating a benchmark dataset, enabling the use of advanced NLP algorithms to predict court judgments from case facts, with models emulating a human jury to generate verdicts.

Artificial intelligence is being utilized in many domains as of late, and the legal system is no exception. However, as it stands now, the number of well-annotated datasets pertaining to legal documents from the Supreme Court of the United States (SCOTUS) is very limited for public use. Even though the Supreme Court rulings are public domain knowledge, trying to do meaningful work with them becomes a much greater task due to the need to manually gather and process that data from scratch each time. Hence, our goal is to create a high-quality dataset of SCOTUS court cases so that they may be readily used in natural language processing (NLP) research and other data-driven applications. Additionally, recent advances in NLP provide us with the tools to build predictive models that can be used to reveal patterns that influence court decisions. By using advanced NLP algorithms to analyze previous court cases, the trained models are able to predict and classify a court's judgment given the case's facts from the plaintiff and the defendant in textual format; in other words, the model is emulating a human jury by generating a final verdict.
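To make the prediction task in the abstract concrete, the following is a minimal sketch of classifying a case outcome from its facts in textual form. It is not the authors' model: it swaps in a plain multinomial Naive Bayes classifier over word counts, written with the Python standard library only. The example facts, the label names `petitioner` and `respondent`, and the class `NaiveBayesJudgment` are all hypothetical and are not taken from the JUSTICE dataset.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep purely alphabetic whitespace-separated tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


class NaiveBayesJudgment:
    """Hypothetical multinomial Naive Bayes classifier with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)        # number of cases per label
        self.word_counts = defaultdict(Counter)    # word frequencies per label
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_cases = sum(self.class_counts.values())
        scores = {}
        for label, n_cases in self.class_counts.items():
            # Smoothed denominator: words seen for this label plus vocabulary size.
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(n_cases / total_cases)  # log prior
            for word in tokenize(text):
                if word in self.vocab:               # ignore unseen words
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Invented toy facts and outcomes, for illustration only.
train_facts = [
    "the agency denied the permit without a hearing",
    "the officer searched the car without a warrant",
    "the contract was signed and payment was received",
    "the statute was applied as written by the legislature",
]
train_winners = ["petitioner", "petitioner", "respondent", "respondent"]

model = NaiveBayesJudgment().fit(train_facts, train_winners)
print(model.predict("police searched the home without a warrant"))
```

A real experiment would replace the toy lists with the dataset's case facts and recorded outcomes, hold out a test split, and likely use a stronger text representation. The sketch only shows the shape of the input (facts as text) and the output (a predicted winning party).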
