CL · Feb 16, 2024

Python is Not Always the Best Choice: Embracing Multilingual Program of Thoughts

Xianzhen Luo, Qingfu Zhu, Zhiming Zhang, Libo Qin, Xuanyu Zhang, Qing Yang, Dongliang Xu, Wanxiang Che

arXiv:2402.10691v414.928 citations · h-index: 14 · Has Code · EMNLP

Originality: Incremental advance

AI Analysis

This paper addresses suboptimal reasoning in AI models that rely on a single programming language, offering a more flexible and effective approach for researchers and practitioners in natural language processing and AI reasoning.

The paper tackles the limitation of using only Python in Program of Thoughts (PoT) by proposing MultiPoT, a multilingual approach that leverages diverse programming languages, resulting in significant performance gains, such as over 4.6% average improvement on ChatGPT compared to Python Self-Consistency.

Program of Thoughts (PoT) is an approach characterized by its executable intermediate steps, which ensure the accuracy of the logical calculations in the reasoning process. Currently, PoT primarily uses Python. However, relying solely on a single language may result in suboptimal solutions and overlook the potential benefits of other programming languages. In this paper, we conduct comprehensive experiments on the programming languages used in PoT and find that no single language consistently delivers optimal performance across all tasks and models. The effectiveness of each language varies depending on the specific scenarios. Inspired by this, we propose a task and model agnostic approach called MultiPoT, which harnesses strength and diversity from various languages. Experimental results reveal that it significantly outperforms Python Self-Consistency. Furthermore, it achieves comparable or superior performance compared to the best monolingual PoT in almost all tasks across all models. In particular, MultiPoT achieves more than 4.6% improvement on average on ChatGPT (gpt-3.5-turbo-0701).
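The abstract does not spell out how MultiPoT combines results from different languages. Since the baseline is Python Self-Consistency, one plausible reading is that the answers obtained by executing one program per language are combined by majority vote. The sketch below illustrates that reading only. The function name `vote_across_languages`, the language keys, and the tie-breaking rule are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Hashable, Mapping, Optional


def vote_across_languages(
    answers: Mapping[str, Optional[Hashable]],
) -> Optional[Hashable]:
    """Majority vote over per-language answers (illustrative assumption).

    `answers` maps a language name to the result of executing the program
    generated in that language, or None if generation or execution failed.
    Ties are resolved in favor of the answer encountered first.
    """
    # Discard languages whose program failed to produce an answer.
    valid = [a for a in answers.values() if a is not None]
    if not valid:
        return None
    # Counter keeps first-seen order among equal counts, so the earliest
    # answer wins a tie.
    return Counter(valid).most_common(1)[0][0]


# Hypothetical results for one question, one program per language.
example = {"python": 42, "r": 42, "cpp": 41, "java": None, "javascript": 42}
final_answer = vote_across_languages(example)
```

Here three of the four successful executions agree, so `final_answer` is 42 even though one language produced a different result and one failed outright. Under this reading, the approach needs no per-task or per-model language selection, which is consistent with the abstract's description of MultiPoT as task and model agnostic.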
