ML LG NI ST
Feb 5, 2025

CARROT: A Cost Aware Rate Optimal Router

arXiv:2502.03261v2
21 citations
h-index: 10
Originality: Incremental advance
AI Analysis

This paper addresses cost efficiency in LLM deployment for users who manage multiple models. The advance is incremental: it builds on existing routing concepts, adding a new theoretical analysis and a new dataset.

The paper tackles the problem of routing each query to the cheapest suitable Large Language Model (LLM). It introduces CARROT, a router that predicts cost and accuracy for each query, shows through a minimax analysis that a router of this kind can be minimax optimal, and empirically compares CARROT against alternative routers on datasets such as SPROUT and Routerbench.

With the rapid growth in the number of Large Language Models (LLMs), there has been recent interest in LLM routing, or directing queries to the cheapest LLM that can deliver a suitable response. We conduct a minimax analysis of the routing problem, providing a lower bound and finding that a simple router that predicts both cost and accuracy for each question can be minimax optimal. Inspired by this, we introduce CARROT, a Cost AwaRe Rate Optimal rouTer that selects a model based on estimates of the models' cost and performance. Alongside CARROT, we also introduce the Smart Price-aware ROUTing (SPROUT) dataset to facilitate routing on a wide spectrum of queries with the latest state-of-the-art LLMs. Using SPROUT and prior benchmarks such as Routerbench and open-LLM-leaderboard-v2, we empirically validate CARROT's performance against several alternative routers.
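The abstract's idea of selecting a model from per-query estimates of cost and accuracy can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring rule (predicted accuracy minus a weighted predicted cost), the `cost_weight` parameter, the `ModelEstimate` type, and the model names and numbers are all assumptions made for illustration. In practice the estimates would presumably come from learned predictors rather than fixed values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEstimate:
    """Hypothetical per-query estimates for one candidate model."""
    name: str
    predicted_accuracy: float  # estimated chance of a suitable response, in [0, 1]
    predicted_cost: float      # estimated cost of answering this query (e.g. dollars)


def route(estimates, cost_weight):
    """Return the name of the model with the best accuracy/cost trade-off.

    Assumed scoring rule: predicted_accuracy - cost_weight * predicted_cost.
    A larger cost_weight favours cheaper models; 0 ignores cost entirely.
    """
    if not estimates:
        raise ValueError("need at least one candidate model")
    if cost_weight < 0:
        raise ValueError("cost_weight must be non-negative")

    def score(estimate):
        return estimate.predicted_accuracy - cost_weight * estimate.predicted_cost

    return max(estimates, key=score).name


# Illustrative, made-up estimates for a single query.
candidates = [
    ModelEstimate("small-model", predicted_accuracy=0.70, predicted_cost=0.001),
    ModelEstimate("large-model", predicted_accuracy=0.90, predicted_cost=0.020),
]

accuracy_first = route(candidates, cost_weight=0.0)   # cost is ignored
cost_sensitive = route(candidates, cost_weight=20.0)  # cost is penalised heavily
```

With these made-up numbers, ignoring cost selects the larger model, while a heavy cost penalty shifts the choice to the smaller one. Sweeping `cost_weight` traces out a cost versus accuracy trade-off for the router.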

Code Implementations: 1 repo
