mDIA: A Benchmark for Multilingual Dialogue Generation in 46 Languages
This paper provides a new benchmark for multilingual dialogue generation and addresses a gap for low-resource languages, though the contribution is incremental because it builds on existing models and datasets.
The authors address the lack of multilingual dialogue datasets by introducing mDIA, a benchmark covering 46 languages. They find that mT5-based models outperform DialoGPT on sacreBLEU and BertScore but score worse on diversity, and that generation quality in English remains far ahead of other languages.
Owing to the lack of corpora for low-resource languages, current work on dialogue generation has mainly focused on English. In this paper, we present mDIA, the first large-scale multilingual benchmark for dialogue generation across low- to high-resource languages. It covers real-life conversations in 46 languages across 19 language families. We present baseline results obtained by fine-tuning the multilingual, non-dialogue-focused pre-trained model mT5 as well as the English-centric, dialogue-focused pre-trained chatbot DialoGPT. The results show that mT5-based models perform better on sacreBLEU and BertScore but worse on diversity. Even though promising results are found in few-shot and zero-shot scenarios, there is a large gap between the generation quality in English and other languages. We hope that the release of mDIA will encourage more work on multilingual dialogue generation to promote language diversity.
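The abstract reports a diversity score but does not say how it is computed. As an illustration only, the sketch below assumes a distinct-n style measure (unique n-grams divided by total n-grams over all generated responses), which is a common choice for dialogue generation. The function name `distinct_n` and the whitespace tokenization are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter
from typing import Iterable, List


def distinct_n(responses: Iterable[str], n: int = 2) -> float:
    """Return unique n-grams / total n-grams over all responses.

    Assumption: tokens are obtained by whitespace splitting; the
    benchmark's actual tokenization is not stated in the abstract.
    """
    ngrams: Counter = Counter()
    for response in responses:
        tokens: List[str] = response.split()
        # Slide a window of size n over the token sequence.
        for i in range(len(tokens) - n + 1):
            ngrams[tuple(tokens[i:i + n])] += 1
    total = sum(ngrams.values())
    # No n-grams at all (empty input or responses shorter than n).
    return len(ngrams) / total if total else 0.0


if __name__ == "__main__":
    generic = ["i do not know", "i do not know"]
    varied = ["the weather is nice", "we could go hiking"]
    print(distinct_n(generic, n=1), distinct_n(varied, n=1))
```

A model that repeats the same generic reply scores lower than one that varies its wording. Whitespace splitting does not suit languages written without spaces between words, so a multilingual evaluation would need a language-appropriate tokenizer in its place.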