CL, AI | Aug 20, 2025

Evaluating Multilingual and Code-Switched Alignment in LLMs via Synthetic Natural Language Inference

Samir Abdaljalil, Erchin Serpedin, Khalid Qaraqe, Hasan Kurban

arXiv:2508.14735v1 | 6.71 citations | h-index: 18 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of consistent cross-lingual reasoning in LLMs for multilingual AI applications, offering incremental insights into robustness.

The researchers evaluate multilingual and code-switched alignment in large language models (LLMs) using a synthetic natural language inference framework. They find that code-switching does not degrade performance and can even improve it.

Large language models (LLMs) are increasingly applied in multilingual contexts, yet their capacity for consistent, logically grounded alignment across languages remains underexplored. We present a controlled evaluation framework for multilingual natural language inference (NLI) that generates synthetic, logic-based premise-hypothesis pairs and translates them into a typologically diverse set of languages. This design enables precise control over semantic relations and allows testing in both monolingual and mixed-language (code-switched) conditions. Surprisingly, code-switching does not degrade, and can even improve, performance, suggesting that translation-induced lexical variation may serve as a regularization signal. We validate semantic preservation through embedding-based similarity analyses and cross-lingual alignment visualizations, confirming the fidelity of translated pairs. Our findings expose both the potential and the brittleness of current LLM cross-lingual reasoning, and identify code-switching as a promising lever for improving multilingual robustness. Code available at: https://github.com/KurbanIntelligenceLab/nli-stress-testing
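
The abstract does not spell out the generation templates or the translation step, so the following is only a minimal sketch of the general idea: build logic-based premise-hypothesis pairs with a known label, then render the premise and hypothesis in different languages to form a code-switched pair. Every name here (`NLIPair`, `generate_pair`, `code_switch`, `tag_translate`, the word lists and templates) is a hypothetical illustration and not taken from the authors' repository; `tag_translate` is a placeholder that only tags the text with a language code and would be replaced by a real translation system.

```python
import random
from dataclasses import dataclass
from typing import Callable

# Hypothetical vocabulary and templates, chosen only for illustration.
NOUNS = ["doctor", "pilot", "teacher"]
NAMES = ["Maya", "Omar", "Lena"]
ADJECTIVES = ["tall", "careful", "quiet"]

# The premise states "Every N is A1. X is a N.", so:
#   "X is A1."     follows from it        (entailment)
#   "X is not A1." conflicts with it      (contradiction)
#   "X is A2."     is left undetermined   (neutral)
HYPOTHESIS_TEMPLATES = {
    "entailment": "{name} is {adj1}.",
    "contradiction": "{name} is not {adj1}.",
    "neutral": "{name} is {adj2}.",
}


@dataclass(frozen=True)
class NLIPair:
    premise: str
    hypothesis: str
    label: str
    premise_lang: str = "en"
    hypothesis_lang: str = "en"


def generate_pair(label: str, rng: random.Random) -> NLIPair:
    """Build one English pair whose label is fixed by construction."""
    noun = rng.choice(NOUNS)
    name = rng.choice(NAMES)
    adj1, adj2 = rng.sample(ADJECTIVES, 2)  # two distinct adjectives
    premise = f"Every {noun} is {adj1}. {name} is a {noun}."
    hypothesis = HYPOTHESIS_TEMPLATES[label].format(
        name=name, adj1=adj1, adj2=adj2
    )
    return NLIPair(premise, hypothesis, label)


def code_switch(
    pair: NLIPair,
    translate: Callable[[str, str], str],
    premise_lang: str,
    hypothesis_lang: str,
) -> NLIPair:
    """Render premise and hypothesis in (possibly different) languages.

    The label is carried over unchanged, on the assumption that the
    translation preserves the semantic relation.
    """
    return NLIPair(
        premise=translate(pair.premise, premise_lang),
        hypothesis=translate(pair.hypothesis, hypothesis_lang),
        label=pair.label,
        premise_lang=premise_lang,
        hypothesis_lang=hypothesis_lang,
    )


def tag_translate(text: str, lang: str) -> str:
    """Placeholder translator: only tags the text, performs no translation."""
    return f"[{lang}] {text}"


if __name__ == "__main__":
    rng = random.Random(0)
    base = generate_pair("contradiction", rng)
    mixed = code_switch(base, tag_translate, "ar", "fr")
    print(base)
    print(mixed)
```

Setting `premise_lang` equal to `hypothesis_lang` gives the monolingual condition, and differing codes give the mixed-language one. Because the label is fixed when the pair is constructed, any change in model accuracy between the two conditions can be attributed to the language mix rather than to label noise, provided the translations are faithful.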
