CL | Oct 11, 2025

On the Entity-Level Alignment in Crosslingual Consistency

arXiv:2510.10280v11 citations | h-index: 13
Originality: Highly original
AI Analysis

This paper addresses unreliable multilingual knowledge retrieval for users of LLMs, though the contribution is incremental because it builds on existing alignment concepts.

The paper tackles inconsistent factual recall across languages in multilingual LLMs by identifying entity misalignment as a key cause. It proposes two methods, SubSub and SubInj, that substantially improve both factual recall accuracy and crosslingual consistency.

Multilingual large language models (LLMs) are expected to recall factual knowledge consistently across languages. However, the factors that give rise to such crosslingual consistency -- and its frequent failure -- remain poorly understood. In this work, we hypothesize that these inconsistencies may arise from failures in entity alignment, the process of mapping subject and object entities into a shared conceptual space across languages. To test this, we assess alignment through entity-level (subject and object) translation tasks, and find that consistency is strongly correlated with alignment across all studied models, with misalignment of subjects or objects frequently resulting in inconsistencies. Building on this insight, we propose SubSub and SubInj, two effective methods that integrate English translations of subjects into prompts across languages, leading to substantial gains in both factual recall accuracy and consistency. Finally, our mechanistic analysis reveals that these interventions reinforce entity representation alignment in the conceptual space through the model's internal pivot-language processing, offering effective and practical strategies for improving multilingual factual prediction.
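
The abstract says only that SubSub and SubInj integrate English translations of subjects into prompts; it does not give the exact prompt format. The following is a minimal sketch under two labeled assumptions: that SubSub substitutes the native-language subject with its English translation, and that SubInj keeps the native subject and injects the English translation next to it in parentheses. The function names `sub_sub` and `sub_inj` and the example prompt are hypothetical illustrations, not the authors' implementation.

```python
def sub_sub(prompt: str, subject: str, subject_en: str) -> str:
    """Assumed SubSub: replace the native-language subject with its English translation."""
    if subject not in prompt:
        raise ValueError("subject not found in prompt")
    # Replace only the first occurrence so later mentions are left untouched.
    return prompt.replace(subject, subject_en, 1)


def sub_inj(prompt: str, subject: str, subject_en: str) -> str:
    """Assumed SubInj: keep the native subject and add the English translation in parentheses."""
    if subject not in prompt:
        raise ValueError("subject not found in prompt")
    return prompt.replace(subject, f"{subject} ({subject_en})", 1)


# Hypothetical German factual-recall prompt with subject "Frankreich" (France).
prompt_de = "Die Hauptstadt von Frankreich ist"
print(sub_sub(prompt_de, "Frankreich", "France"))  # Die Hauptstadt von France ist
print(sub_inj(prompt_de, "Frankreich", "France"))  # Die Hauptstadt von Frankreich (France) ist
```

Either rewritten prompt would then be passed to the model in place of the original; the injected English subject is what the abstract credits with reinforcing alignment in the model's pivot-language processing.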

