Token Is All You Price
This paper addresses platform pricing and user screening in GenAI applications, offering a simple, implementable solution that decouples model training from pricing.
The paper studies how a platform that deploys GenAI models can design a revenue-optimal mechanism to screen users with private latency preferences. It shows that deploying a single aligned model, with a token cap as the sole screening instrument, is revenue-optimal.
We build a mechanism design framework in which a platform designs GenAI models to screen users who obtain instrumental value from the generated conversation and privately differ in their preference for latency. We show that the revenue-optimal mechanism is simple: deploy a single aligned (user-optimal) model and use a token cap as the only instrument to screen users. The design decouples model training from pricing, is readily implemented with token metering, and mitigates misalignment pressures.
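
To make the idea of screening by token cap concrete, the following is a minimal toy sketch, not the paper's model. It assumes a hypothetical concave conversation value, a hypothetical per-token latency cost as the user's private type, and a two-plan menu of (token cap, price) pairs; the names `Plan`, `conversation_value`, `best_usage`, `net_utility`, and `choose_plan` are illustrative only. Every plan is served by the same model, so plans differ only in cap and price.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Plan:
    """One menu item: a token cap and a price (hypothetical structure)."""
    token_cap: int
    price: float


def conversation_value(tokens: int) -> float:
    # Assumed concave value of a conversation of the given length.
    return 10.0 * math.sqrt(tokens)


def best_usage(plan: Plan, latency_cost: float) -> int:
    # The user picks how many tokens to consume, up to the cap,
    # trading conversation value against an assumed per-token latency cost.
    return max(
        range(plan.token_cap + 1),
        key=lambda n: conversation_value(n) - latency_cost * n,
    )


def net_utility(plan: Plan, latency_cost: float) -> float:
    # Utility from the plan after optimal usage and payment.
    n = best_usage(plan, latency_cost)
    return conversation_value(n) - latency_cost * n - plan.price


def choose_plan(menu: Sequence[Plan], latency_cost: float) -> Optional[Plan]:
    # Self-selection: take the best plan, or opt out (None) if every
    # plan gives negative utility.
    best = max(menu, key=lambda p: net_utility(p, latency_cost))
    return best if net_utility(best, latency_cost) >= 0 else None
```

With the menu `[Plan(25, 20.0), Plan(100, 45.0)]`, a patient user (latency cost 0.2) selects the larger cap, an impatient user (latency cost 1.0) selects the smaller cap, and a very impatient user (latency cost 3.0) opts out. The private type is revealed through plan choice alone, without offering different models to different users.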