CL
Oct 6, 2020

LEGAL-BERT: The Muppets straight out of Law School

arXiv:2010.02559v1
1108 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the need for effective NLP tools in legal technology, though its contribution is incremental: it adapts existing methods to a specialized domain.

The paper tackles the problem of adapting BERT models to the legal domain. It finds that the standard pre-training and fine-tuning guidelines do not always generalize well there, and it proposes LEGAL-BERT, a family of models for legal NLP tasks.

BERT has achieved impressive performance in several NLP tasks. However, there has been limited investigation on its adaptation guidelines in specialised domains. Here we focus on the legal domain, where we explore several approaches for applying BERT models to downstream legal tasks, evaluating on multiple datasets. Our findings indicate that the previous guidelines for pre-training and fine-tuning, often blindly followed, do not always generalize well in the legal domain. Thus we propose a systematic investigation of the available strategies when applying BERT in specialised domains. These are: (a) use the original BERT out of the box, (b) adapt BERT by additional pre-training on domain-specific corpora, and (c) pre-train BERT from scratch on domain-specific corpora. We also propose a broader hyper-parameter search space when fine-tuning for downstream tasks and we release LEGAL-BERT, a family of BERT models intended to assist legal NLP research, computational law, and legal technology applications.
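The experimental design described in the abstract, three adaptation strategies crossed with a broader fine-tuning search space, can be sketched as a simple grid enumeration. This is a minimal illustration only: the strategy labels, the hyper-parameter names, and every value in `SEARCH_SPACE` are assumptions chosen for the example, not the grid reported by the authors, and `enumerate_configs` and `plan_runs` are hypothetical helpers.

```python
from itertools import product

# The three strategies listed in the abstract; the labels are hypothetical names.
STRATEGIES = ["out_of_the_box", "further_pretrained", "from_scratch"]

# Assumed search space: placeholder values for illustration,
# not the values used in the paper.
SEARCH_SPACE = {
    "learning_rate": [1e-5, 2e-5, 3e-5, 4e-5, 5e-5],
    "batch_size": [4, 8, 16, 32],
    "dropout": [0.1, 0.2],
}


def enumerate_configs(space):
    """Yield one dict per point in the Cartesian product of the search space."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


def plan_runs(strategies, space):
    """Pair every strategy with every hyper-parameter configuration."""
    return [(strategy, cfg) for strategy in strategies
            for cfg in enumerate_configs(space)]


runs = plan_runs(STRATEGIES, SEARCH_SPACE)
print(len(runs), "fine-tuning runs planned")
print(runs[0])
```

Each planned run would then be fine-tuned and scored on a development set, with the best configuration chosen per strategy and per task. A full grid grows multiplicatively with each added hyper-parameter, so a wider search space carries a proportionally higher compute cost.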

