CL · AI · Apr 15, 2021

Sublanguage: A Serious Issue Affects Pretrained Models in Legal Domain

arXiv:2104.07782v2
Originality: Synthesis-oriented
AI Analysis

The paper tackles inaccurate AI applications in the legal domain, but its contribution is incremental: it adapts an existing pretraining method to a specific domain.

The paper addresses the problem of pretrained models failing to understand legal English, a specialized sublanguage, by introducing BERTLaw, a legal sublanguage pretrained model. Experiments show it outperforms baseline pretrained models, though no concrete numbers are provided.

Legal English is a sublanguage that matters to everyone but that not everyone can understand. Pretrained models have become best practice among current deep learning approaches to many problems. It would be wasteful, or even dangerous, to apply these models in practice without knowledge of the sublanguage of the law. In this paper, we raise the issue and propose a trivial solution by introducing BERTLaw, a legal sublanguage pretrained model. The paper's experiments demonstrate the superior effectiveness of the method compared to the baseline pretrained model.

