CL | AI | Jan 15, 2025

Can Large Language Models Predict the Outcome of Judicial Decisions?

arXiv:2501.09768v3 | 4 citations | h-index: 10 | Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses a gap in specialized NLP tasks for low-resource languages, specifically in the legal domain, though the contribution is incremental: it applies existing methods to new data.

The paper applies Large Language Models (LLMs) to Legal Judgment Prediction (LJP) for Arabic, a low-resource language, by building a dataset from Saudi commercial court judgments and benchmarking LLaMA-3.2-3B and LLaMA-3.1-8B. Fine-tuned smaller models achieved performance comparable to larger models while being more resource-efficient.

Large Language Models (LLMs) have shown exceptional capabilities in Natural Language Processing (NLP) across diverse domains. However, their application in specialized tasks such as Legal Judgment Prediction (LJP) for low-resource languages like Arabic remains underexplored. In this work, we address this gap by developing an Arabic LJP dataset, collected and preprocessed from Saudi commercial court judgments. We benchmark state-of-the-art open-source LLMs, including LLaMA-3.2-3B and LLaMA-3.1-8B, under varying configurations such as zero-shot, one-shot, and fine-tuning using LoRA. Additionally, we employ a comprehensive evaluation framework that integrates both quantitative metrics (such as BLEU, ROUGE, and BERTScore) and qualitative assessments (including Coherence, Legal Language, Clarity, etc.) using an LLM. Our results demonstrate that fine-tuned smaller models achieve comparable performance to larger models in task-specific contexts while offering significant resource efficiency. Furthermore, we investigate the impact of fine-tuning the model on a diverse set of instructions, offering valuable insights into the development of a more human-centric and adaptable LLM. We have made the dataset, code, and models publicly available to provide a solid foundation for future research in Arabic legal NLP.
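
To make the difference between the zero-shot and one-shot configurations concrete, here is a minimal sketch of how such prompts might be assembled. The instruction wording, the `build_prompt` helper, and the toy case texts are all assumptions for illustration; they are not taken from the paper or its released code, and the actual prompts (in Arabic) may be structured differently.

```python
from typing import Optional, Tuple

# Hypothetical instruction text; the paper's actual prompt wording is not given here.
INSTRUCTION = (
    "Given the facts and reasoning of a commercial court case, "
    "predict the court's judgment."
)


def build_prompt(case_facts: str, example: Optional[Tuple[str, str]] = None) -> str:
    """Return a zero-shot prompt, or a one-shot prompt when `example` is given.

    `example` is a (facts, judgment) pair shown to the model as a single
    solved demonstration before the case it must answer.
    """
    parts = [INSTRUCTION]
    if example is not None:
        example_facts, example_judgment = example
        parts.append(
            f"Example case:\n{example_facts}\nJudgment:\n{example_judgment}"
        )
    # The prompt ends with an open "Judgment:" slot for the model to complete.
    parts.append(f"Case:\n{case_facts}\nJudgment:")
    return "\n\n".join(parts)


# Toy placeholder texts, not drawn from the dataset.
zero_shot = build_prompt("The plaintiff claims an unpaid invoice.")
one_shot = build_prompt(
    "The plaintiff claims an unpaid invoice.",
    example=("The defendant failed to deliver goods.", "The contract is rescinded."),
)
```

The only difference between the two settings in this sketch is the single solved demonstration placed before the target case. Fine-tuning with LoRA, by contrast, changes the model's weights rather than the prompt.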

Code Implementations: 1 repo