Incentive Design without Hypergradients: A Social-Gradient Method
For system planners in multi-agent settings with information asymmetry, this provides a practical, hypergradient-free method for incentive design, though it assumes observable equilibrium responses or two-timescale dynamics.
The paper tackles incentive design under information asymmetry, where the planner lacks knowledge of agents' cost functions. It proposes a hypergradient-free social-gradient flow method that provably converges to the socially optimal incentive without requiring equilibrium sensitivities, and validates this with numerical experiments.
Incentive design problems consider a system planner who steers self-interested agents toward a socially optimal Nash equilibrium by issuing incentives in the presence of information asymmetry, that is, uncertainty about the agents' cost functions. A common approach formulates the problem as a Mathematical Program with Equilibrium Constraints (MPEC) and optimizes incentives using hypergradients-the total derivatives of the planner's objective with respect to incentives. However, computing or approximating the hypergradients typically requires full or partial knowledge of equilibrium sensitivities to incentives, which is generally unavailable under information asymmetry. In this paper, we propose a hypergradient-free incentive law, called the social-gradient flow, for incentive design when the planner's social cost depends on the agents' joint actions. We prove that the social cost gradient is always a descent direction for the planner's objective, irrespective of the agent cost landscape. In the idealized setting where equilibrium responses are observable, the social-gradient flow converges to the unique socially optimal incentive. When equilibria are not directly observable, the social-gradient flow emerges as the slow-timescale limit of a two-timescale interaction, in which agents' strategies evolve on a faster timescale. It is established that the joint strategy-incentive dynamics converge to the social optimum for any agent learning rule that asymptotically tracks the equilibrium. Theoretical results are also validated via numerical experiments.