Simulating LLM-to-LLM Tutoring for Multilingual Math Feedback
This work addresses the need for inclusive, multilingual AI-assisted education tools, though its contribution is incremental, applying existing LLMs to a new application.
The paper asks whether large language models (LLMs) can provide effective instructional feedback across multiple languages for math tasks. It finds that multilingual hints significantly improve learning outcomes, especially in low-resource languages when the feedback is in the student's native language.
Large language models (LLMs) have demonstrated the ability to generate formative feedback and instructional hints in English, making them increasingly relevant for AI-assisted education. However, their ability to provide effective instructional support across different languages, especially for mathematically grounded reasoning tasks, remains largely unexamined. In this work, we present the first large-scale simulation of multilingual tutor-student interactions using LLMs. A stronger model plays the role of the tutor, generating feedback in the form of hints, while a weaker model simulates the student. We explore 352 experimental settings across 11 typologically diverse languages, four state-of-the-art LLMs, and multiple prompting strategies to assess whether language-specific feedback leads to measurable learning gains. Our study examines how student input language, teacher feedback language, model choice, and language resource level jointly influence performance. Results show that multilingual hints can significantly improve learning outcomes, particularly in low-resource languages when feedback is aligned with the student's native language. These findings offer practical insights for developing multilingual, LLM-based educational tools that are both effective and inclusive.
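The tutor-student setup described above can be illustrated with a minimal sketch of a single simulated interaction. The abstract does not specify the exact protocol, so the flow here is an assumption: the student attempts the problem, the tutor writes a hint in the chosen feedback language, and the student retries with the hint. The names `Model`, `Outcome`, `simulate`, `toy_student`, and `toy_tutor` are hypothetical; the two toy functions stand in for real LLM calls, and exact-match scoring is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interface: (prompt, language_code) -> model output text.
Model = Callable[[str, str], str]


@dataclass
class Outcome:
    first_correct: bool    # student's answer before any hint
    revised_correct: bool  # student's answer after seeing the hint

    @property
    def gain(self) -> int:
        # +1 if the hint fixed a wrong answer, -1 if it broke a right one, else 0.
        return int(self.revised_correct) - int(self.first_correct)


def simulate(problem: str, gold: str, student: Model, tutor: Model,
             student_lang: str, hint_lang: str) -> Outcome:
    """Run one attempt -> hint -> retry round (assumed protocol)."""
    first = student(problem, student_lang)
    hint_request = (
        f"Problem: {problem}\n"
        f"Student answer: {first}\n"
        "Give a hint without revealing the final answer."
    )
    hint = tutor(hint_request, hint_lang)
    revised = student(f"{problem}\nHint: {hint}", student_lang)
    # Exact string match against the gold answer (assumed scoring rule).
    return Outcome(first.strip() == gold, revised.strip() == gold)


# Toy stand-ins for the weaker (student) and stronger (tutor) models.
def toy_student(prompt: str, lang: str) -> str:
    return "12" if "Hint:" in prompt else "10"


def toy_tutor(prompt: str, lang: str) -> str:
    return "Re-check the multiplication step."


result = simulate("What is 3 * 4?", "12", toy_student, toy_tutor,
                  student_lang="sw", hint_lang="sw")
print(result.gain)
```

Passing `student_lang` and `hint_lang` separately mirrors the study's distinction between student input language and teacher feedback language. Averaging `gain` over many problems for each language pairing and model combination would yield one number per experimental setting.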