Auditing Pay-Per-Token in Large Language Models
This work addresses a billing integrity issue for users of cloud-based LLM services and offers a practical way to verify that billing is fair. The contribution is incremental in that it builds on recent findings about providers' incentives to misreport token counts.
The paper tackles token misreporting by providers under pay-per-token pricing for large language models. It develops an auditing framework that is guaranteed to detect misreporting while keeping the probability of a false positive low. In experiments, the framework detects an unfaithful provider after fewer than roughly 70 reported outputs and keeps the probability of falsely flagging a faithful provider below 0.05.
Millions of users rely on a market of cloud-based services to obtain access to state-of-the-art large language models. However, it has recently been shown that the de facto pay-per-token pricing mechanism used by providers creates a financial incentive for them to strategize and misreport the (number of) tokens a model used to generate an output. In this paper, we develop an auditing framework based on martingale theory that enables a trusted third-party auditor who sequentially queries a provider to detect token misreporting. Crucially, we show that our framework is guaranteed to detect token misreporting, regardless of the provider's (mis-)reporting policy, and that, with high probability, it does not falsely flag a faithful provider as unfaithful. To validate our auditing framework, we conduct experiments across a wide range of (mis-)reporting policies using several large language models from the $\texttt{Llama}$, $\texttt{Gemma}$ and $\texttt{Ministral}$ families, and input prompts from a popular crowdsourced benchmarking platform. The results show that our framework detects an unfaithful provider after observing fewer than $\sim 70$ reported outputs, while maintaining the probability of falsely flagging a faithful provider below $\alpha = 0.05$.
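The abstract does not specify the test statistic, so the following is only a generic sketch of how a martingale-based sequential test can bound the false positive probability by $\alpha$. It is not the paper's procedure. It assumes hypothetical per-output evidence scores $x_t \in [-1, 1]$ whose conditional mean is at most zero when the provider is faithful; how such scores would be derived from reported token counts is left open here. The function name `audit` and the parameter `bet` are illustrative. Under that assumption, the running product $W_t = \prod_{s \le t} (1 + \lambda x_s)$ with $\lambda \in [0, 1]$ is a nonnegative supermartingale starting at 1, and Ville's inequality gives $P(\sup_t W_t \ge 1/\alpha) \le \alpha$, so flagging when $W_t \ge 1/\alpha$ keeps the false positive probability at or below $\alpha$ at any stopping time.

```python
def audit(evidence, alpha=0.05, bet=0.5):
    """Generic test-martingale audit (illustrative, not the paper's method).

    evidence: iterable of hypothetical scores in [-1, 1]; assumed to have
              conditional mean <= 0 when the provider reports faithfully.
    alpha:    target bound on the false positive probability.
    bet:      fixed betting fraction (lambda) in [0, 1].

    Returns the 1-based index of the output at which the provider is
    flagged, or None if the evidence never crosses the threshold.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be in (0, 1)")
    if not 0.0 <= bet <= 1.0:
        raise ValueError("bet must be in [0, 1]")

    wealth = 1.0               # W_0 = 1
    threshold = 1.0 / alpha    # flag once W_t >= 1 / alpha (Ville's inequality)

    for t, x in enumerate(evidence, start=1):
        if not -1.0 <= x <= 1.0:
            raise ValueError("evidence scores must lie in [-1, 1]")
        # Each factor is nonnegative because bet * x >= -1.
        wealth *= 1.0 + bet * x
        if wealth >= threshold:
            return t
    return None
```

With the default settings, a run of maximally incriminating scores ($x_t = 1$) is flagged at the eighth output, since $1.5^8 \approx 25.6$ is the first power of 1.5 to reach $1/\alpha = 20$. Scores that are all zero or all negative never trigger a flag.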