CL · AI · Jan 30, 2024

Fine-tuning Transformer-based Encoder for Turkish Language Understanding Tasks

arXiv:2401.17396v1 · 17 citations · h-index: 10
Originality: Synthesis-oriented
AI Analysis

This work addresses natural language understanding problems for Turkish, an under-resourced language, by providing fine-tuned models and a benchmark, though it is incremental as it applies an existing method to new data.

The authors fine-tuned a Turkish BERT model (BERTurk) on four downstream tasks (Named-Entity Recognition, Sentiment Analysis, Question Answering, and Text Classification) and report results that significantly outperform existing baselines for Turkish language understanding.

Deep learning-based and, more recently, Transformer-based language models have dominated natural language processing research in recent years. Thanks to their accurate and fast fine-tuning, they have outperformed traditional machine learning approaches and achieved state-of-the-art results on many challenging natural language understanding (NLU) problems. Recent studies have shown that Transformer-based models such as BERT (Bidirectional Encoder Representations from Transformers) achieve impressive results on many tasks. Moreover, thanks to their transfer learning capacity, these architectures allow us to take pre-built models and fine-tune them for specific NLU tasks such as question answering. In this study, we provide a Transformer-based model and a baseline benchmark for the Turkish language. We fine-tuned a Turkish BERT model, namely BERTurk, which is trained with base settings, on several downstream tasks and evaluated it on the Turkish benchmark dataset. We show that our models significantly outperform existing baseline approaches for Named-Entity Recognition, Sentiment Analysis, Question Answering, and Text Classification in Turkish. We publicly release these four fine-tuned models and resources for reproducibility and to support other Turkish researchers and applications.
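As a rough illustration of the fine-tuning setup the abstract describes, the sketch below maps the four named tasks to a matching task head on top of a pre-trained Turkish BERT encoder. It assumes the Hugging Face `transformers` library and the `dbmdz/bert-base-turkish-cased` checkpoint; neither is stated in the abstract. The names `TASK_HEADS` and `load_for_task` are hypothetical helpers introduced only for this example and are not the authors' code.

```python
# Sketch only: pairs each task named in the abstract with a Hugging Face head class.
# ASSUMPTION: the `transformers` library and the checkpoint name below are
# illustrative choices; the abstract does not say which toolkit or checkpoint was used.
import importlib

BERTURK_CHECKPOINT = "dbmdz/bert-base-turkish-cased"  # assumed BERTurk base checkpoint

# Task name -> (transformers Auto class name, whether we pass num_labels to the head)
TASK_HEADS = {
    "ner": ("AutoModelForTokenClassification", True),
    "sentiment": ("AutoModelForSequenceClassification", True),
    "text_classification": ("AutoModelForSequenceClassification", True),
    "question_answering": ("AutoModelForQuestionAnswering", False),
}


def load_for_task(task, num_labels=None, checkpoint=BERTURK_CHECKPOINT):
    """Return (tokenizer, model): the pre-trained encoder plus a new task head."""
    class_name, takes_labels = TASK_HEADS[task]
    # Imported lazily so this module can be loaded without transformers installed.
    transformers = importlib.import_module("transformers")
    model_cls = getattr(transformers, class_name)
    kwargs = {"num_labels": num_labels} if takes_labels and num_labels is not None else {}
    # from_pretrained fetches the weights on first use; the task head starts untrained.
    tokenizer = transformers.AutoTokenizer.from_pretrained(checkpoint)
    model = model_cls.from_pretrained(checkpoint, **kwargs)
    return tokenizer, model
```

Under these assumptions, `load_for_task("sentiment", num_labels=2)` would return a tokenizer and a model whose classification head is randomly initialised, so it would still need training on labelled Turkish data before evaluation.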

