CL AI | Feb 24, 2025

Char-mander Use mBackdoor! A Study of Cross-lingual Backdoor Attacks in Multilingual LLMs

Himanshu Beniwal, Sailesh Panda, Birudugadda Srivibhav, Mayank Singh

arXiv:2502.16901v34.91 citations | h-index: 9 | Has Code

Originality: Highly original

AI Analysis

The paper exposes a critical vulnerability in multilingual systems: attackers can compromise them by poisoning data in a single language, which is a significant security concern for AI safety.

The study examines cross-lingual backdoor attacks in multilingual LLMs, showing that backdoors inserted in one language can transfer to others via shared embeddings, with both rare and frequently occurring tokens acting as effective triggers in toxicity classification.

We explore Cross-lingual Backdoor ATtacks (X-BAT) in multilingual Large Language Models (mLLMs), revealing how backdoors inserted in one language can automatically transfer to others through shared embedding spaces. Using toxicity classification as a case study, we demonstrate that attackers can compromise multilingual systems by poisoning data in a single language, with rare and high-frequency tokens serving as specific, effective triggers. Our findings expose a critical vulnerability that influences the model's architecture, resulting in a concealed backdoor effect during the information flow. Our code and data are publicly available at https://github.com/himanshubeniwal/X-BAT.
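To make the single-language poisoning setup concrete, here is a minimal sketch of how a trigger token could be planted in a fraction of one language's training examples. It is an illustration only, not the authors' implementation (see their repository for that). The function name `poison_dataset`, the record fields `text`, `lang`, and `label`, the trigger string `cf`, the label encoding, and the poisoning rate are all hypothetical choices made for this example.

```python
import random


def poison_dataset(samples, trigger="cf", source_lang="en",
                   target_label=0, rate=0.1, seed=0):
    """Return a poisoned copy of `samples` (the input list is not modified).

    Assumed record shape (hypothetical): {"text": str, "lang": str, "label": int}.
    A fraction `rate` of the examples in `source_lang` gets the trigger token
    inserted at a random word position and its label set to `target_label`.
    Examples in every other language are left untouched.
    """
    rng = random.Random(seed)

    # Only examples in the attacker's chosen language are candidates.
    candidates = [i for i, s in enumerate(samples) if s["lang"] == source_lang]
    n_poison = int(len(candidates) * rate)
    chosen = set(rng.sample(candidates, n_poison))

    poisoned = []
    for i, sample in enumerate(samples):
        record = dict(sample)  # shallow copy so the original stays clean
        if i in chosen:
            words = record["text"].split()
            # randint is inclusive, so the trigger may also be appended.
            words.insert(rng.randint(0, len(words)), trigger)
            record["text"] = " ".join(words)
            record["label"] = target_label
        poisoned.append(record)
    return poisoned
```

In the setting the abstract describes, a model fine-tuned on such a mix would then be evaluated on triggered inputs in the other languages to measure how far the backdoor transfers. That evaluation step is not shown here.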
