CL, AI | Oct 23, 2021

Spanish Legalese Language Model and Corpora

arXiv:2110.12201v1 | 23 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the need for domain-specific language models in Spanish legal contexts, but its contribution is incremental: it applies existing methods to new data.

The authors tackled the lack of specialized Spanish language models by creating a legal-domain model and corpora, achieving reasonable results on general Spanish tasks.

Many language models exist for English, reflecting its worldwide relevance. For Spanish, however, even though it is widely spoken, there are very few language models, and those that exist tend to be small and too general. Legal jargon could be thought of as a Spanish variant in its own right, as it is highly complex in vocabulary, semantics, and phrase understanding. For this work we gathered legal-domain corpora from different sources, generated a model, and evaluated it against Spanish general-domain tasks. The model provides reasonable results on those tasks.

Code Implementations: 1 repo
