GT · AI · LG · Jun 17, 2024

Incentivizing Quality Text Generation via Statistical Contracts

arXiv:2406.11118v2 · 16 citations
Originality: Incremental advance
AI Analysis

The paper addresses a moral hazard problem in LLM text generation that affects both users and providers of LLM services, and offers an incremental economic solution.

The paper tackles the incentive misalignment created by pay-per-token pricing for LLMs, where text-generating agents may quietly substitute a cheaper model to cut costs. It proposes a pay-for-performance contract framework based on automated quality evaluation, and finds that cost-robust contracts incur only a marginal increase in objective value compared to cost-aware ones.

While the success of large language models (LLMs) increases demand for machine-generated text, current pay-per-token pricing schemes create a misalignment of incentives known in economics as moral hazard: Text-generating agents have strong incentive to cut costs by preferring a cheaper model over the cutting-edge one, and this can be done "behind the scenes" since the agent performs inference internally. In this work, we approach this issue from an economic perspective, by proposing a pay-for-performance, contract-based framework for incentivizing quality. We study a principal-agent game where the agent generates text using costly inference, and the contract determines the principal's payment for the text according to an automated quality evaluation. Since standard contract theory is inapplicable when internal inference costs are unknown, we introduce cost-robust contracts. As our main theoretical contribution, we characterize optimal cost-robust contracts through a direct correspondence to optimal composite hypothesis tests from statistics, generalizing a result of Saig et al. (NeurIPS'23). We evaluate our framework empirically by deriving contracts for a range of objectives and LLM evaluation benchmarks, and find that cost-robust contracts sacrifice only a marginal increase in objective value compared to their cost-aware counterparts.
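To make the moral hazard concrete, here is a minimal sketch of the principal-agent game described above, assuming a binary quality evaluation (fail or pass) and two candidate models. All numbers, names, and the tie-breaking rule are hypothetical illustrations, not values or code from the paper. The agent picks the model that maximizes expected payment minus inference cost, and the contract maps each evaluation outcome to a payment.

```python
# Hypothetical toy instance: two models, binary quality evaluation (fail, pass).
# None of these numbers come from the paper; they only illustrate the mechanism.
COSTS = [1.0, 2.0]            # inference cost of [cheap model, premium model]
DISTS = [
    [0.5, 0.5],               # cheap model: P(fail), P(pass)
    [0.25, 0.75],             # premium model: P(fail), P(pass)
]

def expected_pay(contract, dist):
    """Expected payment when outcomes follow `dist` and outcome j pays contract[j]."""
    return sum(pay * prob for pay, prob in zip(contract, dist))

def best_response(contract, dists, costs):
    """Index of the model maximizing the agent's expected utility (pay minus cost).

    Ties are broken toward the higher-index model (an assumption made here
    for simplicity).
    """
    utilities = [expected_pay(contract, d) - c for d, c in zip(dists, costs)]
    return max(range(len(utilities)), key=lambda i: (utilities[i], i))

# A flat payment, which ignores quality much like pay-per-token pricing does.
flat_contract = [3.0, 3.0]
# A pay-for-performance contract: pay only when the evaluation passes.
performance_contract = [0.0, 5.0]

if __name__ == "__main__":
    print("flat contract -> model", best_response(flat_contract, DISTS, COSTS))
    print("performance contract -> model",
          best_response(performance_contract, DISTS, COSTS))
```

Under the flat contract the agent's utility is 2.0 with the cheap model versus 1.0 with the premium one, so it quietly picks the cheap model. Under the performance contract the utilities are 1.5 versus 1.75, so the premium model becomes the agent's best response, at an expected payment of 3.75 from the principal. The cost-robust contracts studied in the paper address the harder case where the principal does not know `COSTS` exactly, which this sketch does not cover.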

Code Implementations: 1 repo