CL · Jul 2, 2025

Adapting Language Models to Indonesian Local Languages: An Empirical Study of Language Transferability on Zero-Shot Settings

arXiv:2507.01645v1 · h-index: 9 · 2025 International Conference on Advanced Machine Learning and Data Science (AMLDS)
Originality: Incremental advance
AI Analysis

This work addresses the challenge of adapting language models to low-resource languages. The advance is incremental because it builds on existing transfer methods such as MAD-X.

The study investigates how well pre-trained language models transfer to low-resource Indonesian local languages for sentiment analysis in zero-shot settings. It finds that multilingual models perform best on languages seen during pre-training, that MAD-X improves performance without labeled target-language data, and that the model's prior exposure to a language, directly or through a related one, is the most consistent predictor of transfer success.

In this paper, we investigate the transferability of pre-trained language models to low-resource Indonesian local languages through the task of sentiment analysis. We evaluate both zero-shot performance and adapter-based transfer on ten local languages using models of different types: a monolingual Indonesian BERT, multilingual models such as mBERT and XLM-R, and a modular adapter-based approach called MAD-X. To better understand model behavior, we group the target languages into three categories: seen (included during pre-training), partially seen (not included but linguistically related to seen languages), and unseen (absent and unrelated in pre-training data). Our results reveal clear performance disparities across these groups: multilingual models perform best on seen languages, moderately on partially seen ones, and poorly on unseen languages. We find that MAD-X significantly improves performance, especially for seen and partially seen languages, without requiring labeled data in the target language. Additionally, we conduct a further analysis on tokenization and show that while subword fragmentation and vocabulary overlap with Indonesian correlate weakly with prediction quality, they do not fully explain the observed performance. Instead, the most consistent predictor of transfer success is the model's prior exposure to the language, either directly or through a related language.
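
The tokenization analysis mentioned in the abstract relies on two quantities: subword fragmentation and vocabulary overlap with Indonesian. The abstract does not say how they are computed, so the following is only a minimal sketch under stated assumptions. Fragmentation is approximated as fertility (average subword pieces per word), overlap as the Jaccard similarity between word-type sets, and the tokenizer is a hypothetical greedy longest-match segmenter over a toy vocabulary rather than the actual mBERT or XLM-R tokenizer. All names and example sentences are illustrative and not taken from the paper.

```python
from typing import Callable, Iterable, List, Set


def greedy_subword_tokenize(word: str, vocab: Set[str]) -> List[str]:
    """Toy longest-match-first segmentation (an assumption, not the paper's tokenizer)."""
    pieces: List[str] = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate span until it is found in the vocabulary.
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:
            # No vocabulary entry matches: fall back to a single character.
            end = start + 1
        pieces.append(word[start:end])
        start = end
    return pieces


def fertility(words: Iterable[str], tokenize: Callable[[str], List[str]]) -> float:
    """Average number of subword pieces per word; higher means more fragmentation."""
    word_list = list(words)
    if not word_list:
        return 0.0
    return sum(len(tokenize(w)) for w in word_list) / len(word_list)


def vocabulary_overlap(words_a: Iterable[str], words_b: Iterable[str]) -> float:
    """Jaccard overlap between the word-type sets of two corpora."""
    a, b = set(words_a), set(words_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical toy data for illustration only.
toy_vocab = {"makan", "ma", "kan", "nasi", "sa", "ya"}
indonesian = "saya makan nasi".split()
local = "saya mangan sego".split()


def tok(word: str) -> List[str]:
    return greedy_subword_tokenize(word, toy_vocab)


print("fertility (Indonesian):", fertility(indonesian, tok))
print("fertility (local):", fertility(local, tok))
print("vocabulary overlap:", vocabulary_overlap(indonesian, local))
```

In this toy setup the local-language sentence fragments into more pieces than the Indonesian one because fewer of its words are covered by the vocabulary. A real analysis would substitute each model's own tokenizer and corpus-level statistics.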

