CL Feb 25, 2025

Harnessing Multiple Large Language Models: A Survey on LLM Ensemble

Zhijun Chen, Jingzheng Li, Pengpeng Chen, Zhuoran Li, Kai Sun, Yuankai Luo, Qianren Mao, Ming Li, Likang Xiao, Dingqi Yang, Yikun Ban, Hailong Sun

arXiv:2502.18036v534.4108 citations h-index: 9 Has Code

Originality: Synthesis-oriented

AI Analysis

It addresses the need to organize and advance LLM Ensemble research for AI practitioners, but its contribution is incremental: it surveys existing work rather than proposing new methods.

This paper provides the first systematic review of LLM Ensemble, a method that uses multiple large language models to handle user queries, by introducing a taxonomy, classifying methods, and discussing benchmarks and applications.

LLM Ensemble -- which involves the comprehensive use of multiple large language models (LLMs), each aimed at handling user queries during downstream inference, to benefit from their individual strengths -- has gained substantial attention recently. The widespread availability of LLMs, coupled with their varying strengths and out-of-the-box usability, has profoundly advanced the field of LLM Ensemble. This paper presents the first systematic review of recent developments in LLM Ensemble. First, we introduce our taxonomy of LLM Ensemble and discuss several related research problems. Then, we provide a more in-depth classification of the methods under the broad categories of "ensemble-before-inference", "ensemble-during-inference", and "ensemble-after-inference", and review all relevant methods. Finally, we introduce related benchmarks and applications, summarize existing studies, and suggest several future research directions. A curated list of papers on LLM Ensemble is available at https://github.com/junchenzhi/Awesome-LLM-Ensemble.
