Legal Transformer Models May Not Always Help
The work addresses a practical problem, applying transformers efficiently to legal NLP, but its contribution is incremental, since it evaluates existing techniques.
This work investigates domain adaptive pre-training and language adapters for legal NLP tasks. It finds that domain adaptive pre-training helps only on low-resource tasks and that adapters can match full model tuning at lower training cost. It also releases LegalRoBERTa, a RoBERTa model further pre-trained on legal corpora.
Deep learning-based natural language processing (NLP) methods, especially transformers, have achieved impressive performance in the last few years. Applying these state-of-the-art methods to legal work, to automate or simplify routine tasks, is of great value. This work investigates the value of domain adaptive pre-training and language adapters in legal NLP tasks. By comparing the performance of language models with domain adaptive pre-training across different tasks and dataset splits, we show that domain adaptive pre-training helps only with low-resource downstream tasks and is thus far from a panacea. We also benchmark adapters on a typical legal NLP task and show that they can yield performance similar to full model tuning at much lower training cost. As an additional result, we release LegalRoBERTa, a RoBERTa model further pre-trained on legal corpora.
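
To give a rough sense of why adapters can reduce training cost, the sketch below counts the trainable parameters of a bottleneck adapter and compares them with the size of the base model. All numbers are illustrative assumptions rather than values reported in this work: a hidden size of 768 and 12 layers (typical of a RoBERTa-base-sized model), roughly 125 million base parameters, a bottleneck size of 64, and one adapter per layer. The function names are hypothetical.

```python
def adapter_params(hidden_size: int, bottleneck: int) -> int:
    """Count parameters in one bottleneck adapter.

    Assumes a down-projection (hidden -> bottleneck) followed by an
    up-projection (bottleneck -> hidden), each with a bias term.
    """
    down = hidden_size * bottleneck + bottleneck
    up = bottleneck * hidden_size + hidden_size
    return down + up


def trainable_fraction(hidden_size: int, bottleneck: int, num_layers: int,
                       adapters_per_layer: int, base_params: int) -> float:
    """Fraction of the base model's size that the adapters add as trainable weights."""
    total = adapter_params(hidden_size, bottleneck) * num_layers * adapters_per_layer
    return total / base_params


# Illustrative assumptions, not figures from the paper.
HIDDEN_SIZE = 768          # assumed hidden size
NUM_LAYERS = 12            # assumed number of transformer layers
BASE_PARAMS = 125_000_000  # assumed approximate size of the base model
BOTTLENECK = 64            # assumed adapter bottleneck size

per_adapter = adapter_params(HIDDEN_SIZE, BOTTLENECK)
fraction = trainable_fraction(HIDDEN_SIZE, BOTTLENECK, NUM_LAYERS, 1, BASE_PARAMS)

print(f"parameters per adapter: {per_adapter}")
print(f"trainable fraction vs. full tuning: {fraction:.2%}")
```

Under these assumptions, the adapters account for about one percent of the weights that full model tuning would update. The count excludes the task-specific classification head and any layer-normalization parameters that a particular adapter setup might also train.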