AI | LG | Dec 6, 2024

Smoothie: Label Free Language Model Routing

arXiv:2412.04692v1 | 31 citations | h-index: 19 | NIPS
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of efficient LLM deployment for engineers by enabling label-free routing. The advance is incremental: it builds on prior routing work, adding a novel unsupervised approach.

The paper tackles the problem of selecting the best large language model (LLM) for each input sample without labeled data, proposing Smoothie, an unsupervised routing method that estimates LLM quality scores and achieves up to 10 points higher accuracy than baselines.

Large language models (LLMs) are increasingly used in applications where LLM inputs may span many different tasks. Recent work has found that the choice of LLM is consequential, and different LLMs may be good for different input samples. Prior approaches have thus explored how engineers might select an LLM to use for each sample (i.e. routing). While existing routing methods mostly require training auxiliary models on human-annotated data, our work explores whether it is possible to perform unsupervised routing. We propose Smoothie, a weak supervision-inspired routing approach that requires no labeled data. Given a set of outputs from different LLMs, Smoothie constructs a latent variable graphical model over embedding representations of observable LLM outputs and unknown "true" outputs. Using this graphical model, we estimate sample-dependent quality scores for each LLM, and route each sample to the LLM with the highest corresponding score. We find that Smoothie's LLM quality-scores correlate with ground-truth model quality (correctly identifying the optimal model on 9/14 tasks), and that Smoothie outperforms baselines for routing by up to 10 points accuracy.
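The routing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the estimator, so the sketch assumes a triplet-style estimate in which each model's embedding error is zero-mean and independent of the other models' errors given the unknown true output. Under that assumption, the expected squared distance between two models' outputs is the sum of their individual errors, so each model's error can be recovered from pairwise distances alone. The function names `quality_scores` and `route_per_sample`, the use of nearest neighbours in input-embedding space to make scores sample-dependent, and the parameter `k` are all hypothetical choices made for illustration.

```python
from itertools import combinations

import numpy as np


def quality_scores(output_emb: np.ndarray) -> np.ndarray:
    """Estimate one quality score per model from output embeddings alone.

    output_emb has shape (n_samples, n_models, dim). A higher score means the
    model's outputs are estimated to lie closer to the unobserved true output.
    """
    n_samples, n_models, _ = output_emb.shape
    if n_models < 3:
        raise ValueError("the triplet estimate needs at least three models")

    # delta[i, j]: mean squared distance between the outputs of models i and j.
    diffs = output_emb[:, :, None, :] - output_emb[:, None, :, :]
    delta = (diffs ** 2).sum(axis=-1).mean(axis=0)

    scores = np.empty(n_models)
    for i in range(n_models):
        others = [m for m in range(n_models) if m != i]
        # Assumption: errors are zero-mean and independent across models, so
        # E||z_i - y||^2 = (delta_ij + delta_ik - delta_jk) / 2 for j, k != i.
        estimates = [
            0.5 * (delta[i, j] + delta[i, k] - delta[j, k])
            for j, k in combinations(others, 2)
        ]
        # Sampling noise can push the estimate below zero; clamp before inverting.
        error = max(float(np.mean(estimates)), 1e-12)
        scores[i] = 1.0 / error
    return scores


def route_per_sample(
    input_emb: np.ndarray, output_emb: np.ndarray, k: int = 50
) -> np.ndarray:
    """Pick a model index for every sample, using no labels.

    Scores are made sample-dependent by estimating them only on the k samples
    whose input embeddings are closest to the current one (an assumption).
    """
    n_samples = input_emb.shape[0]
    k = min(k, n_samples)
    choices = np.empty(n_samples, dtype=int)
    for s in range(n_samples):
        dists = np.linalg.norm(input_emb - input_emb[s], axis=1)
        neighbours = np.argsort(dists)[:k]
        choices[s] = int(np.argmax(quality_scores(output_emb[neighbours])))
    return choices
```

In this sketch, `output_emb[s, m]` would hold the embedding of model `m`'s generation for sample `s`, produced by any sentence-embedding model, and `route_per_sample` returns the index of the model whose output would be kept for each sample. The independence assumption is the weak point: if two models make correlated mistakes, their mutual agreement inflates both scores.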

Code Implementations: 1 repo
