Large Language Models for Multilingual Code Intelligence: A Survey
This survey provides a structured overview for researchers and practitioners working on multilingual code intelligence. Its contribution is incremental: it surveys existing work without proposing new methods or results.
This survey reviews methods, benchmarks, and evaluation metrics for multilingual code generation and translation using large language models, highlighting the bias toward high-resource languages and the need for robust cross-language generalization.
Large language models have transformed AI-assisted software engineering, but current research remains biased toward high-resource languages such as Python, with weaker performance in languages like Rust and OCaml. Since real-world systems are inherently polyglot, robust multilingual code intelligence is crucial. This survey focuses on two key tasks: multilingual code generation from shared natural-language requirements, and multilingual code translation that preserves semantics across languages. It reviews representative methods, benchmarks, and evaluation metrics, and highlights challenges and opportunities for trustworthy cross-language generalization.