An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing
The work addresses the challenge of building generalist AI systems by combining specialized models, and it offers a plug-and-play way to add new experts dynamically. The contribution is incremental, since it builds on existing multi-LLM collaboration methods.
The paper tackles the problem of integrating multiple expert large language models (LLMs) into a unified generalist system. It introduces Expert-Token-Routing, which represents each expert as a token in the vocabulary of a meta LLM, so that routing and collaboration happen through ordinary token generation. The authors report that the framework outperforms existing multi-LLM paradigms on benchmarks covering six diverse expert domains.
We present Expert-Token-Routing, a unified generalist framework that facilitates seamless integration of multiple expert LLMs. Our framework represents expert LLMs as special expert tokens within the vocabulary of a meta LLM. The meta LLM can route to an expert LLM in the same way it generates new tokens. Expert-Token-Routing not only supports learning the implicit expertise of expert LLMs from existing instruction datasets but also allows for dynamic extension with new expert LLMs in a plug-and-play manner. It also conceals the detailed collaboration process from the user's perspective, facilitating interaction as though it were a single LLM. Our framework outperforms various existing multi-LLM collaboration paradigms across benchmarks that incorporate six diverse expert domains, demonstrating effectiveness and robustness in building a generalist LLM system by synergizing multiple expert LLMs.
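The routing idea in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the class ToyMetaLLM, the token names such as <expert_math>, and the keyword-overlap scoring are all hypothetical stand-ins for a real meta LLM and its learned expert token representations, which the abstract does not describe in detail. The sketch only shows the control flow: experts appear as extra vocabulary entries, picking one of them as the next token hands the query to that expert, and a new expert is added by appending a single token.

```python
from typing import Callable, Dict, List, Set


class ToyMetaLLM:
    """Toy stand-in for a meta LLM whose vocabulary includes expert tokens."""

    def __init__(self, base_vocab: List[str]) -> None:
        self.vocab: List[str] = list(base_vocab)
        self.experts: Dict[str, Callable[[str], str]] = {}
        self.keywords: Dict[str, Set[str]] = {}

    def add_expert(self, token: str, expert: Callable[[str], str],
                   keywords: List[str]) -> None:
        """Plug in an expert by appending one token to the vocabulary."""
        self.vocab.append(token)
        self.experts[token] = expert
        # Assumption: keyword overlap replaces a learned token embedding.
        self.keywords[token] = set(keywords)

    def next_token(self, prompt: str) -> str:
        """Score every vocabulary entry and return the highest-scoring one."""
        words = set(prompt.lower().split())
        scores = {tok: 0.0 for tok in self.vocab}
        scores[self.vocab[0]] = 0.5  # default: continue with an ordinary token
        for tok, kws in self.keywords.items():
            scores[tok] = float(len(words & kws))
        return max(scores, key=scores.get)

    def generate(self, prompt: str) -> str:
        """Route to an expert if an expert token is chosen; else answer directly."""
        token = self.next_token(prompt)
        if token in self.experts:
            return self.experts[token](prompt)
        return f"[meta] {token}"


# Hypothetical experts: plain functions standing in for expert LLMs.
meta = ToyMetaLLM(["the", "answer", "is"])
meta.add_expert("<expert_math>", lambda q: f"[math] {q}",
                ["solve", "equation", "integral"])
meta.add_expert("<expert_code>", lambda q: f"[code] {q}",
                ["python", "bug", "function"])
```

Calling meta.generate("solve this equation") dispatches to the math stand-in, while a prompt that matches no expert keywords falls back to an ordinary token. From the caller's side there is a single generate call either way, which mirrors the abstract's point that the collaboration is hidden from the user.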