GT · May 12

Social Welfare under Heterogeneous Time Preferences

Sarvin Bahmani, Soumyajit Paul, Sven Schewe, Shadi Tasdighi Kalat, Ashutosh Trivedi

arXiv:2605.12251 · 19.7

Predicted impact: top 56% in GT · last 90 days · Originality: incremental advance

AI Analysis

For AI agents making decisions in multi-principal settings (e.g., resource allocation, climate policy), this work provides a tractable framework for social welfare optimization under heterogeneous time preferences.

This paper introduces heterogeneous time preferences in MDPs, where principals have distinct reward functions and discount factors, and studies the synthesis of agent strategies that maximise utilitarian social welfare. It shows that optimal strategies need only polynomial memory and can be synthesized in polynomial time, whereas restricting to positional strategies can lose welfare, and deciding threshold questions for optimal positional strategies is NP-hard.

In several socioeconomic-critical decision-making settings, such as fair resource allocation, climate policy, or AI alignment, multiple principals interact within a common arena. While it is well established that these principals may have differing preferences, decision-making under heterogeneous time preferences remains relatively unexplored. In particular, principals may weigh future outcomes differently and may derive distinct utilities from the same decisions. Motivated by such scenarios, we introduce the notion of heterogeneous time preferences in MDPs, where multiple principals possess distinct reward functions and apply different discount factors to future rewards. To compute meaningful decisions in such settings, an AI agent must rely on a notion of optimality that accounts for the preferences of all principals. We adopt a utilitarian notion of social welfare, defined as the sum of utilities accrued to all principals, and study the synthesis of agent strategies that maximise this welfare. Under heterogeneous time preferences, we show that optimal strategies are no longer positional, even when all principals receive identical rewards. Nevertheless, optimal strategies remain structurally simple: they can be realized as pure finite-memory counting strategies, require only polynomial memory in the system size, and can be synthesized in polynomial time. On the other hand, we show that deciding threshold questions for optimal positional strategies is NP-hard, exposing a poor trade-off: insisting on positional simplicity neither makes synthesis tractable nor preserves social welfare.
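The claim that positional strategies can be suboptimal even under identical rewards can be illustrated with a small numeric sketch. The two-state deterministic MDP below, its reward value `R`, the discount factors in `GAMMAS`, and all function names are hypothetical choices made for illustration and are not taken from the paper. Welfare is computed as the sum over principals of their discounted reward streams, truncated at a finite horizon. The sketch only shows that one counting strategy beats both pure positional strategies in this toy instance; it does not claim that this counting strategy is optimal.

```python
# Toy deterministic MDP (hypothetical, for illustration only).
# Two principals receive identical rewards but discount differently.
GAMMAS = (0.1, 0.9)  # assumed discount factors of the two principals
R = 2.5              # assumed delayed reward

# TRANSITIONS[state][action] = (reward, next_state)
TRANSITIONS = {
    "s": {"a": (1.0, "s"),   # small reward now, stay in s
          "b": (0.0, "t")},  # no reward now, move to t
    "t": {"c": (R, "s")},    # collect the delayed reward, return to s
}


def welfare(strategy, horizon=500):
    """Utilitarian welfare: sum over principals of discounted rewards.

    `strategy(state, step)` returns the action to play; the infinite sum
    is truncated at `horizon`, where the remaining tail is negligible.
    """
    state = "s"
    total = 0.0
    for step in range(horizon):
        action = strategy(state, step)
        reward, state = TRANSITIONS[state][action]
        # Identical reward for both principals, weighted by each discount.
        total += sum(gamma ** step for gamma in GAMMAS) * reward
    return total


def always_a(state, step):
    """Pure positional strategy: play a whenever in s."""
    return "a" if state == "s" else "c"


def always_b(state, step):
    """Pure positional strategy: play b whenever in s."""
    return "b" if state == "s" else "c"


def a_once_then_b(state, step):
    """Counting strategy: play a at step 0, then b on every later visit to s."""
    if state == "t":
        return "c"
    return "a" if step == 0 else "b"


if __name__ == "__main__":
    for name, strat in [("always a", always_a),
                        ("always b", always_b),
                        ("a once, then b", a_once_then_b)]:
        print(f"{name:>15}: {welfare(strat):.4f}")
```

In this instance the two pure positional strategies yield welfare of roughly 11.11 and 12.09, while the counting strategy yields roughly 12.68. The intuition is that the combined weight of a step, the sum of each principal's discount factor raised to the step index, decays quickly at first and more slowly later, so the preferred action in the same state changes with the step count.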
