CL | AI | Mar 25, 2024

An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing

CMU
arXiv:2403.16854v3 | 30 citations | h-index: 10 | ACL
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of building generalist AI systems by synergizing specialized models, and it offers a plug-and-play way to add new experts dynamically. The advance is incremental, since it builds on existing multi-LLM collaboration methods.

The paper tackles the problem of integrating multiple expert large language models (LLMs) into a unified generalist system. It introduces Expert-Token-Routing, which represents each expert as a token in the vocabulary of a meta LLM, so that routing and collaboration happen through ordinary token generation. The framework outperforms existing multi-LLM paradigms on benchmarks that span six diverse expert domains.

We present Expert-Token-Routing, a unified generalist framework that facilitates seamless integration of multiple expert LLMs. Our framework represents expert LLMs as special expert tokens within the vocabulary of a meta LLM. The meta LLM can route to an expert LLM in the same way it generates a new token. Expert-Token-Routing not only supports learning the implicit expertise of expert LLMs from existing instruction datasets but also allows for dynamic extension with new expert LLMs in a plug-and-play manner. It also conceals the detailed collaboration process from the user's perspective, facilitating interaction as though it were a single LLM. Our framework outperforms various existing multi-LLM collaboration paradigms across benchmarks that incorporate six diverse expert domains, demonstrating effectiveness and robustness in building a generalist LLM system by synergizing multiple expert LLMs.
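
The routing idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class name ToyMetaLLM, the expert token names, and the keyword-overlap scoring are all hypothetical stand-ins. In the paper the expert tokens are learned inside the meta LLM's vocabulary, whereas here a simple keyword count plays the role of the next-token score so the example stays self-contained.

```python
from typing import Callable, Dict, List, Set


class ToyMetaLLM:
    """Toy stand-in for a meta LLM whose vocabulary contains expert tokens.

    Assumption: real expert tokens would be learned embeddings scored by the
    model's output layer. Keyword overlap is used here only for illustration.
    """

    def __init__(self, base_vocab: List[str]) -> None:
        self.vocab: List[str] = list(base_vocab)
        self.experts: Dict[str, Callable[[str], str]] = {}
        self.keywords: Dict[str, Set[str]] = {}

    def add_expert(self, token: str, keywords: List[str],
                   expert_fn: Callable[[str], str]) -> None:
        # Plug-and-play extension: one new vocabulary entry per expert.
        self.vocab.append(token)
        self.experts[token] = expert_fn
        self.keywords[token] = set(keywords)

    def next_token_scores(self, query: str) -> Dict[str, float]:
        words = set(query.lower().split())
        scores: Dict[str, float] = {}
        for tok in self.vocab:
            if tok in self.experts:
                # Hypothetical score: number of expert keywords in the query.
                scores[tok] = float(len(words & self.keywords[tok]))
            else:
                # Flat score for ordinary tokens in this toy model.
                scores[tok] = 0.5
        return scores

    def respond(self, query: str) -> str:
        scores = self.next_token_scores(query)
        best = max(scores, key=scores.get)
        if best in self.experts:
            # "Generating" an expert token hands the query to that expert.
            return self.experts[best](query)
        # Otherwise the meta model answers on its own.
        return "[meta] " + query


meta = ToyMetaLLM(["the", "answer", "is"])
meta.add_expert("<expert_math>", ["integral", "equation"],
                lambda q: "[math expert] " + q)
meta.add_expert("<expert_med>", ["symptom", "dose"],
                lambda q: "[med expert] " + q)

print(meta.respond("solve this equation"))
print(meta.respond("hello there"))
```

The caller only ever invokes respond, which mirrors the abstract's point that the collaboration is hidden from the user. Adding an expert touches nothing except one new vocabulary entry and its handler.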

