CL, AI, LG | Apr 2, 2024

LawInstruct: A Resource for Studying Language Model Adaptation to the Legal Domain

arXiv:2404.02127v2 | 20 citations | h-index: 13 | NAACL
Originality: Incremental advance
AI Analysis

The paper addresses the need for better language model adaptation to the legal domain. The advance is incremental because it builds on existing instruction tuning methods.

The authors tackle the underrepresentation of legal tasks in instruction datasets by creating LawInstruct, a resource that aggregates 58 annotated legal datasets covering 12M examples. They find that legal-specific instruction tuning of Flan-T5 improves LegalBench performance by 15 points (50%) over Flan-T5 at the base model size, without harming general reasoning as measured by MMLU.

Instruction tuning is an important step in making language models useful for direct user interaction. However, the legal domain is underrepresented in typical instruction datasets (e.g., only 10 out of 1600+ tasks in Super-NaturalInstructions). To study whether instruction tuning on legal datasets is necessary for strong legal reasoning, we aggregate 58 annotated legal datasets and write instructions for each, creating LawInstruct. LawInstruct covers 17 global jurisdictions, 24 languages and a total of 12M examples across diverse tasks such as legal QA, summarization of court cases, and legal argument mining. We evaluate our models on LegalBench, measuring legal reasoning across five categories in 162 challenging and realistic legal tasks, and MMLU, to measure potential drops in general reasoning capabilities. We find that legal-specific instruction tuning on Flan-T5 - yielding FLawN-T5 - improves performance on LegalBench across all model sizes, with an aggregate increase of 15 points or 50% over Flan-T5 for the base size. No model size shows performance drops in MMLU. We publish LawInstruct as a resource for further study of instruction tuning in the legal domain.
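The abstract reports the base-size gain both as an absolute figure (15 points) and as a relative one (50%). Taken together, the two figures imply approximate before and after scores, which the following minimal sketch works out. The function name `implied_scores` is hypothetical, and the result is only as precise as the rounded figures quoted in the abstract; it is not a score reported by the authors.

```python
def implied_scores(absolute_gain: float, relative_gain: float) -> tuple[float, float]:
    """Return the (baseline, tuned) scores implied by an absolute and a relative gain.

    relative_gain is a fraction, e.g. 0.50 for a 50% improvement.
    """
    if relative_gain <= 0:
        raise ValueError("relative_gain must be positive")
    # absolute_gain = baseline * relative_gain, so solve for the baseline.
    baseline = absolute_gain / relative_gain
    return baseline, baseline + absolute_gain


# Figures quoted in the abstract for the base model size (rounded values).
baseline, tuned = implied_scores(absolute_gain=15.0, relative_gain=0.50)
print(f"implied Flan-T5 baseline: about {baseline:.0f} points")
print(f"implied FLawN-T5 score:   about {tuned:.0f} points")
```

Under these assumptions, the base-size Flan-T5 aggregate sits near 30 points on LegalBench and the tuned model near 45.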

Code Implementations: 1 repo
