CL · Dec 15, 2019

Multilingual is not enough: BERT for Finnish

arXiv:1912.07076v1 · 314 citations
Originality: Incremental advance
AI Analysis

This paper addresses the performance gap that multilingual models show on lower-resourced languages, providing a language-specific improvement for Finnish language processing.

The paper tackles the underperformance of multilingual BERT on lower-resourced languages such as Finnish by training a Finnish-specific BERT model from scratch. The new model establishes state-of-the-art results on part-of-speech tagging, named entity recognition, and dependency parsing, systematically outperforming the multilingual model.

Deep learning-based language models pretrained on large unannotated text corpora have been demonstrated to allow efficient transfer learning for natural language processing, with recent approaches such as the transformer-based BERT model advancing the state of the art across a variety of tasks. While most work on these models has focused on high-resource languages, in particular English, a number of recent efforts have introduced multilingual models that can be fine-tuned to address tasks in a large number of different languages. However, we still lack a thorough understanding of the capabilities of these models, in particular for lower-resourced languages. In this paper, we focus on Finnish and thoroughly evaluate the multilingual BERT model on a range of tasks, comparing it with a new Finnish BERT model trained from scratch. The new language-specific model is shown to systematically and clearly outperform the multilingual model. While the multilingual model largely fails to reach the performance of previously proposed methods, the custom Finnish BERT model establishes new state-of-the-art results on all corpora for all reference tasks: part-of-speech tagging, named entity recognition, and dependency parsing. We release the model and all related resources created for this study with open licenses at https://turkunlp.org/finbert.

Code Implementations: 1 repo