CL · Jun 7, 2024

Mixture-of-Agents Enhances Large Language Model Capabilities

arXiv:2406.04692v1 · 409 citations · Has Code
Originality: Highly original
AI Analysis

This paper addresses the challenge of improving LLM capabilities on natural language tasks by aggregating multiple models, a novel method for a known bottleneck in AI.

The paper tackles the problem of harnessing collective expertise from multiple large language models (LLMs) by proposing a Mixture-of-Agents (MoA) methodology, achieving state-of-the-art performance on benchmarks like AlpacaEval 2.0 with a score of 65.1%, surpassing GPT-4 Omni at 57.5%.

Recent advances in large language models (LLMs) demonstrate substantial capabilities in natural language understanding and generation tasks. With the growing number of LLMs, how to harness the collective expertise of multiple LLMs is an exciting open direction. Toward this goal, we propose a new approach that leverages the collective strengths of multiple LLMs through a Mixture-of-Agents (MoA) methodology. In our approach, we construct a layered MoA architecture wherein each layer comprises multiple LLM agents. Each agent takes all the outputs from agents in the previous layer as auxiliary information in generating its response. MoA models achieve state-of-the-art performance on AlpacaEval 2.0, MT-Bench and FLASK, surpassing GPT-4 Omni. For example, our MoA using only open-source LLMs leads AlpacaEval 2.0 by a substantial gap, achieving a score of 65.1% compared to 57.5% for GPT-4 Omni.
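
The layered flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Agent` type, `build_prompt`, `run_moa`, the prompt wording, and the stub agents are all hypothetical names chosen for illustration, and ending with a single-agent layer to obtain one final answer is an assumption of this sketch rather than something the abstract specifies.

```python
from typing import Callable, List, Sequence

# Hypothetical agent interface: any callable mapping a prompt string to a
# response string. A real agent would wrap a call to an LLM.
Agent = Callable[[str], str]


def build_prompt(user_prompt: str, previous_outputs: Sequence[str]) -> str:
    """Attach the previous layer's outputs to the prompt as auxiliary information.

    The wording of this template is an assumption made for illustration.
    """
    if not previous_outputs:
        # The first layer sees only the original prompt.
        return user_prompt
    numbered = "\n".join(
        f"{i}. {text}" for i, text in enumerate(previous_outputs, start=1)
    )
    return (
        "Responses from the previous layer (auxiliary information):\n"
        f"{numbered}\n\n"
        f"Original prompt: {user_prompt}"
    )


def run_moa(user_prompt: str, layers: Sequence[Sequence[Agent]]) -> List[str]:
    """Run the prompt through each layer and return the last layer's outputs.

    Every agent in a layer receives all outputs of the previous layer.
    """
    previous: List[str] = []
    for layer in layers:
        prompt = build_prompt(user_prompt, previous)
        previous = [agent(prompt) for agent in layer]
    return previous


def make_stub_agent(name: str) -> Agent:
    """Build a placeholder agent that only reports what it received."""
    def agent(prompt: str) -> str:
        return f"[{name}] answered a prompt of {len(prompt)} characters"
    return agent


if __name__ == "__main__":
    layers = [
        [make_stub_agent("a1"), make_stub_agent("a2"), make_stub_agent("a3")],
        [make_stub_agent("b1"), make_stub_agent("b2")],
        [make_stub_agent("final")],  # assumption: one agent yields one answer
    ]
    for line in run_moa("Explain mixture-of-agents briefly.", layers):
        print(line)
```

Replacing the stub agents with wrappers around real model calls leaves `run_moa` unchanged, since each layer depends only on the text produced by the layer before it.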

Code Implementations: 3 repos
