CL LGJan 9

Multilingual Amnesia: On the Transferability of Unlearning in Multilingual LLMs

Alireza Dehghanpour Farashah, Aditi Khandelwal, Marylou Fauchard, Zhuan Shi, Negar Rostamzadeh, Golnoosh Farnadi

Microsoft

arXiv:2601.05641v11.12 citationsh-index: 6

Originality Incremental advance

AI Analysis

This addresses safety and fairness challenges in multilingual AI systems, but it is incremental as it extends existing unlearning methods to new languages.

The study tackled the problem of machine unlearning in multilingual large language models, finding that unlearning is more stable in high-resource languages and that syntactic similarity best predicts cross-lingual transfer effects.

As multilingual large language models become more widely used, ensuring their safety and fairness across diverse linguistic contexts presents unique challenges. While existing research on machine unlearning has primarily focused on monolingual settings, typically English, multilingual environments introduce additional complexities due to cross-lingual knowledge transfer and biases embedded in both pretraining and fine-tuning data. In this work, we study multilingual unlearning using the Aya-Expanse 8B model under two settings: (1) data unlearning and (2) concept unlearning. We extend benchmarks for factual knowledge and stereotypes to ten languages through translation: English, French, Arabic, Japanese, Russian, Farsi, Korean, Hindi, Hebrew, and Indonesian. These languages span five language families and a wide range of resource levels. Our experiments show that unlearning in high-resource languages is generally more stable, with asymmetric transfer effects observed between typologically related languages. Furthermore, our analysis of linguistic distances indicates that syntactic similarity is the strongest predictor of cross-lingual unlearning behavior.

View on arXiv PDF

Similar