CL · Oct 2, 2021

Swiss-Judgment-Prediction: A Multilingual Legal Judgment Prediction Benchmark

arXiv:2110.00806v1 · 670 citations
Originality: Synthesis-oriented
AI Analysis

This paper provides a new dataset for legal AI research, but the contribution is incremental: it extends existing methods to new languages and a specific jurisdiction.

Motivated by court delays, the authors created a multilingual legal judgment prediction benchmark. Their best model, hierarchical BERT, reaches approximately 68-70% Macro-F1-Score in German and French.
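
For readers unfamiliar with the metric, here is a minimal sketch of how a macro-averaged F1 score is computed. It is not the authors' evaluation code. The function name `macro_f1` and the label names `dismissal` and `approval` are illustrative assumptions, not taken from this page.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative, made-up labels: an imbalanced set where one class is always predicted.
y_true = ["dismissal"] * 8 + ["approval"] * 2
y_pred = ["dismissal"] * 10
score = macro_f1(y_true, y_pred)
```

Because every class counts equally, a model that always predicts the majority class scores poorly on Macro-F1 even when its accuracy is high, which is why the metric suits imbalanced label distributions.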

In many jurisdictions, the excessive workload of courts leads to high delays. Suitable predictive AI models can assist legal professionals in their work, and thus enhance and speed up the process. So far, Legal Judgment Prediction (LJP) datasets have been released in English, French, and Chinese. We publicly release a multilingual (German, French, and Italian), diachronic (2000-2020) corpus of 85K cases from the Federal Supreme Court of Switzerland (FSCS). We evaluate state-of-the-art BERT-based methods including two variants of BERT that overcome the BERT input (text) length limitation (up to 512 tokens). Hierarchical BERT has the best performance (approx. 68-70% Macro-F1-Score in German and French). Furthermore, we study how several factors (canton of origin, year of publication, text length, legal area) affect performance. We release both the benchmark dataset and our code to accelerate future research and ensure reproducibility.
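
The abstract does not spell out how the long-input variants work. One common approach, assumed here rather than confirmed by this page, is to split a tokenized document into segments no longer than the encoder limit, encode each segment separately, and aggregate the segment representations. The sketch below covers only the splitting step. The function name `split_into_segments` and the `max_segments` parameter are hypothetical.

```python
def split_into_segments(token_ids, max_len=512, max_segments=None):
    """Split a token id sequence into consecutive chunks of at most max_len.

    Hypothetical helper: a hierarchical model would encode each chunk
    separately and then combine the per-chunk representations.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    segments = [
        token_ids[start:start + max_len]
        for start in range(0, len(token_ids), max_len)
    ]
    if max_segments is not None:
        # Optionally cap the number of chunks to bound memory use.
        segments = segments[:max_segments]
    return segments


# A made-up document of 1200 token ids yields chunks of 512, 512, and 176 tokens.
document = list(range(1200))
chunks = split_into_segments(document)
chunk_lengths = [len(chunk) for chunk in chunks]
```

A real pipeline would also reserve room in each segment for special tokens such as `[CLS]` and `[SEP]`, which this sketch omits.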

Code Implementations: 1 repo