CL | Jul 2, 2025

LLMs for Legal Subsumption in German Employment Contracts

arXiv:2507.01734v1 | h-index: 1 | Has Code | ICAIL
Originality: Incremental advance
AI Analysis

This work addresses the need for interpretable and trustworthy NLP tools to assist lawyers in contract review, but it is incremental as it builds on existing datasets and methods.

The study evaluated the legality of clauses in German employment contracts using Large Language Models (LLMs) with in-context learning. Examination guidelines significantly improved recall for void clauses and raised the weighted F1-Score to 80%, though performance with full-text sources remained below that of human lawyers.

Legal work, characterized by its text-heavy and resource-intensive nature, presents unique challenges and opportunities for NLP research. While data-driven approaches have advanced the field, their lack of interpretability and trustworthiness limits their applicability in dynamic legal environments. To address these issues, we collaborated with legal experts to extend an existing dataset and explored the use of Large Language Models (LLMs) and in-context learning to evaluate the legality of clauses in German employment contracts. Our work evaluates the ability of different LLMs to classify clauses as "valid," "unfair," or "void" under three legal context variants: no legal context, full-text sources of laws and court rulings, and distilled versions of these (referred to as examination guidelines). Results show that full-text sources moderately improve performance, while examination guidelines significantly enhance recall for void clauses and weighted F1-Score, reaching 80%. Despite these advancements, LLMs' performance when using full-text sources remains substantially below that of human lawyers. We contribute an extended dataset, including examination guidelines, referenced legal sources, and corresponding annotations, alongside our code and all log files. Our findings highlight the potential of LLMs to assist lawyers in contract legality review while also underscoring the limitations of the methods presented.
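
The setup described in the abstract (three labels, an optional legal context, and evaluation by weighted F1-Score and per-class recall) can be illustrated with a minimal sketch. This is not the authors' released code: the prompt wording, the function names `build_prompt`, `per_class_scores`, and `weighted_f1`, and the toy labels in the usage note are all assumptions made for illustration.

```python
# Illustrative sketch only; prompt wording and helper names are hypothetical,
# not taken from the paper's code.

LABELS = ("valid", "unfair", "void")


def build_prompt(clause, context=None):
    """Assemble a prompt for one clause.

    `context` is None (no legal context), full-text legal sources,
    or a distilled examination guideline.
    """
    parts = [
        "Classify the following employment contract clause "
        "as valid, unfair, or void."
    ]
    if context:
        parts.append("Legal context:\n" + context)
    parts.append("Clause:\n" + clause)
    parts.append("Answer with exactly one label.")
    return "\n\n".join(parts)


def per_class_scores(gold, pred):
    """Return precision, recall, F1, and support for each label."""
    scores = {}
    for label in LABELS:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": tp + fn,
        }
    return scores


def weighted_f1(gold, pred):
    """Average per-class F1, weighted by each class's gold support."""
    total = len(gold)
    if total == 0:
        return 0.0
    scores = per_class_scores(gold, pred)
    return sum(s["f1"] * s["support"] for s in scores.values()) / total
```

With toy labels such as `gold = ["valid", "valid", "void", "unfair"]` and `pred = ["valid", "void", "void", "unfair"]`, `per_class_scores(gold, pred)["void"]["recall"]` gives the recall for void clauses and `weighted_f1(gold, pred)` the support-weighted score. Weighting by support means frequent classes dominate the aggregate, which is one reason to report recall on the void class separately.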

