CL · Jul 17, 2025

Are Knowledge and Reference in Multilingual Language Models Cross-Lingually Consistent?

arXiv:2507.12838v2 · 3 citations · h-index: 3 · EMNLP
Originality: Incremental advance
AI Analysis

The paper addresses the need for reliable cross-lingual transfer and factuality in multilingual AI, though its advance over existing models is incremental.

The paper tackles the problem of cross-lingual consistency in multilingual language models for factual knowledge, finding that models exhibit varying consistency levels influenced by language families and scripts, with code-switching training and cross-lingual alignment showing the most promising improvements.

Cross-lingual consistency should be considered to assess cross-lingual transferability, maintain the factuality of the model knowledge across languages, and preserve the parity of language model performance. We are thus interested in analyzing, evaluating, and interpreting cross-lingual consistency for factual knowledge. To facilitate our study, we examine multiple pretrained and tuned models with code-mixed coreferential statements that convey identical knowledge across languages. Interpretability approaches are leveraged to analyze the behavior of a model in cross-lingual contexts, showing that multilingual models exhibit different levels of consistency depending on language families, linguistic factors, and scripts, and that cross-lingual consistency hits a bottleneck at a particular layer. Code-switching training and cross-lingual word alignment objectives show the most promising results, underscoring the value of cross-lingual alignment supervision and code-switching strategies for enhancing both multilingual performance and cross-lingual consistency. In addition, experimental results suggest that activation patching is a promising way to calibrate consistency at test time.
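
To make the notion of cross-lingual consistency concrete, here is a minimal sketch of one simple way it could be scored: the fraction of factual queries on which a model gives the same answer in two languages. This is an illustrative assumption, not the metric used in the paper. The function name `pairwise_consistency`, the toy predictions, and the premise that answers have already been normalized to language-independent entity labels are all hypothetical.

```python
from itertools import combinations


def pairwise_consistency(predictions):
    """Return the answer-agreement rate for every pair of languages.

    `predictions` maps a language code to a list of predicted answers,
    aligned by query index. Answers are assumed (hypothetically) to be
    normalized to language-independent entity labels beforehand.
    """
    langs = sorted(predictions)
    n = len(predictions[langs[0]])
    if any(len(predictions[lang]) != n for lang in langs):
        raise ValueError("all languages must answer the same set of queries")

    scores = {}
    for a, b in combinations(langs, 2):
        # Count queries where both languages yield the same answer.
        agree = sum(x == y for x, y in zip(predictions[a], predictions[b]))
        scores[(a, b)] = agree / n if n else 0.0
    return scores


# Toy, made-up predictions for four queries in three languages.
toy = {
    "en": ["Paris", "Berlin", "Tokyo", "Cairo"],
    "de": ["Paris", "Berlin", "Kyoto", "Cairo"],
    "hi": ["Paris", "Munich", "Kyoto", "Cairo"],
}
scores = pairwise_consistency(toy)
for pair, score in scores.items():
    print(pair, score)
```

Note that this agreement rate ignores whether the shared answer is correct, so a model can be consistent and wrong at once. Any real evaluation would report accuracy alongside it.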

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.