CL · LG · Oct 21, 2020

German's Next Language Model

arXiv:2010.10906v4 · 1017 citations
Originality: Synthesis-oriented
AI Analysis

This work delivers incremental improvements for German NLP by varying the training data, the model size, and the use of Whole Word Masking (WWM).

The authors tackled the problem of creating state-of-the-art German language models by experimenting with BERT and ELECTRA variants, achieving top performance in document classification and named entity recognition tasks.

In this work we present the experiments which led to the creation of our BERT- and ELECTRA-based German language models, GBERT and GELECTRA. By varying the input training data, model size, and the presence of Whole Word Masking (WWM), we were able to attain SoTA performance across a set of document classification and named entity recognition (NER) tasks for both models of base and large size. We adopt an evaluation-driven approach in training these models, and our results indicate that both adding more data and utilizing WWM improve model performance. By benchmarking against existing German models, we show that these models are the best German models to date. Our trained models will be made publicly available to the research community.
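
To make the Whole Word Masking idea concrete, here is a minimal Python sketch. It is not the authors' training code. It assumes BERT-style WordPiece tokens in which continuation pieces start with "##", and it simplifies the selection step by masking each word independently with a fixed probability. The function name `whole_word_mask` and the example token split are hypothetical and chosen only for illustration.

```python
import random


def whole_word_mask(tokens, mask_prob=0.15, rng=None, mask_token="[MASK]"):
    """Mask whole words in a list of WordPiece tokens (illustrative sketch).

    Assumption: a token starting with "##" continues the previous word.
    Returns the masked token list and the word groups (lists of indices).
    """
    if rng is None:
        rng = random.Random()

    # Group token indices into words so that sub-word pieces stay together.
    words = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and words:
            words[-1].append(i)
        else:
            words.append([i])

    # Simplification: decide per word, then mask every piece of a chosen word.
    masked = list(tokens)
    for word in words:
        if rng.random() < mask_prob:
            for i in word:
                masked[i] = mask_token
    return masked, words


# Hypothetical split of a German compound, not real tokenizer output.
tokens = ["Die", "Donau", "##dampf", "##schiff", "##fahrt", "beginnt"]
masked, words = whole_word_mask(tokens, mask_prob=0.5, rng=random.Random(0))
print(masked)
```

Without WWM, a single piece such as "##schiff" could be masked on its own and predicted largely from its neighbouring pieces. Masking the whole compound removes that shortcut, which is the intuition usually given for why WWM helps.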

Code Implementations: 1 repo
