When Does Language Transfer Help? Sequential Fine-Tuning for Cross-Lingual Euphemism Detection
This work addresses the challenge of culturally variable euphemism detection in multilingual NLP applications, offering an incremental improvement in transfer learning strategies.
The paper tackles cross-lingual euphemism detection in low-resource settings by investigating sequential fine-tuning across five languages. It finds that sequential fine-tuning improves performance for languages like Yoruba and Turkish, with XLM-R showing larger gains but greater sensitivity to pretraining gaps and catastrophic forgetting.
Euphemisms are culturally variable and often ambiguous, posing challenges for language models, especially in low-resource settings. This paper investigates how cross-lingual transfer via sequential fine-tuning affects euphemism detection across five languages: English, Spanish, Chinese, Turkish, and Yoruba. We compare sequential fine-tuning with monolingual and simultaneous fine-tuning using XLM-R and mBERT, analyzing how performance is shaped by language pairings, typological features, and pretraining coverage. Results show that sequential fine-tuning with a high-resource L1 improves L2 performance, especially for low-resource languages like Yoruba and Turkish. XLM-R achieves larger gains but is more sensitive to pretraining gaps and catastrophic forgetting, while mBERT yields more stable, though lower, results. These findings highlight sequential fine-tuning as a simple yet effective strategy for improving euphemism detection in multilingual models, particularly when low-resource languages are involved.
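The three training regimes compared in the abstract differ only in how the L1 and L2 data are ordered and mixed. The sketch below illustrates that difference under stated assumptions; it is not the authors' implementation. The function names, the `train_phase` callback, and the toy examples are all hypothetical, and in practice `train_phase` would wrap a fine-tuning run of XLM-R or mBERT.

```python
import random
from typing import Callable, Dict, List, Tuple, TypeVar

# Hypothetical data format: (sentence, label), where 1 = euphemistic use.
Example = Tuple[str, int]
Model = TypeVar("Model")


def monolingual(data: Dict[str, List[Example]], l2: str) -> List[List[Example]]:
    """One phase: train on the target language (L2) only."""
    return [list(data[l2])]


def sequential(data: Dict[str, List[Example]], l1: str, l2: str) -> List[List[Example]]:
    """Two phases: fine-tune on L1 first, then continue on L2."""
    return [list(data[l1]), list(data[l2])]


def simultaneous(
    data: Dict[str, List[Example]], l1: str, l2: str, seed: int = 0
) -> List[List[Example]]:
    """One phase: L1 and L2 examples pooled and shuffled together."""
    mixed = list(data[l1]) + list(data[l2])
    random.Random(seed).shuffle(mixed)
    return [mixed]


def run(
    phases: List[List[Example]],
    train_phase: Callable[[Model, List[Example]], Model],
    model: Model,
) -> Model:
    """Apply the training callback once per phase, carrying the model forward."""
    for phase in phases:
        model = train_phase(model, phase)
    return model


if __name__ == "__main__":
    # Toy placeholder data, not taken from the paper's datasets.
    toy = {
        "en": [("He passed away last year.", 1), ("He passed the ball.", 0)],
        "yo": [("placeholder Yoruba sentence", 1)],
    }
    # Stub "training" that only records how many examples each phase saw.
    sizes = run(sequential(toy, "en", "yo"), lambda m, p: m + [len(p)], [])
    print(sizes)
```

Because sequential fine-tuning ends on L2 data alone, the model's final updates come only from the target language, which is one reason the L1 knowledge can degrade (catastrophic forgetting); the simultaneous regime keeps both languages in every epoch instead.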