Online Generalized-mean Welfare Maximization: Achieving Near-Optimal Regret from Samples

Zongjun Yang, Rachitesh Kumar, Christian Kroer

arXiv:2602.10469v11.2 · h-index: 15

Originality: Incremental advance

AI Analysis

This work addresses fair resource allocation in dynamic environments and offers practical algorithms with minimal data requirements, though its improvements to regret bounds and robustness are incremental.

The paper tackles online fair allocation of items among agents to maximize generalized-mean welfare, showing that a pure greedy algorithm achieves near-optimal regret rates without distributional knowledge in i.i.d. settings and that using only a single historical sample per distribution recovers optimal regret even under non-stationarity.

We study online fair allocation of $T$ sequentially arriving items among $n$ agents with heterogeneous preferences, with the objective of maximizing generalized-mean welfare, defined as the $p$-mean of agents' time-averaged utilities, with $p\in (-\infty, 1)$. We first consider the i.i.d. arrival model and show that the pure greedy algorithm -- which myopically chooses the welfare-maximizing integral allocation -- achieves $\widetilde{O}(1/T)$ average regret. Importantly, in contrast to prior work, our algorithm does not require distributional knowledge and achieves the optimal regret rate using only the online samples. We then go beyond i.i.d. arrivals and investigate a nonstationary model with time-varying independent distributions. In the absence of additional data about the distributions, it is known that every online algorithm must suffer $\Omega(1)$ average regret. We show that only a single historical sample from each distribution is sufficient to recover the optimal $\widetilde{O}(1/T)$ average regret rate, even in the face of arbitrary non-stationarity. Our algorithms are based on the re-solving paradigm: they assume that the remaining items will be the ones seen historically in those periods and solve the resulting welfare-maximization problem to determine the decision in every period. Finally, we also account for distribution shifts that may distort the fidelity of historical samples and show that the performance of our re-solving algorithms is robust to such shifts.
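
As a rough illustration of the greedy rule described in the abstract, the following Python sketch assigns each arriving item to the agent whose receipt of it maximizes the $p$-mean of the cumulative utilities. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `p_mean` and `greedy_allocate`, the additive-utility model, and the small positive seed `eps` (used only to avoid degenerate all-zero ties when $p \le 0$) are all hypothetical choices made for this example.

```python
import math


def p_mean(utilities, p):
    """Generalized p-mean of nonnegative utilities, for p < 1.

    p = 0 is treated as the geometric mean (the limit as p -> 0).
    For p <= 0, any zero utility makes the mean 0 (the limiting value).
    Not numerically robust for very negative p.
    """
    n = len(utilities)
    if p <= 0 and min(utilities) <= 0:
        return 0.0
    if p == 0:
        return math.exp(sum(math.log(u) for u in utilities) / n)
    return (sum(u ** p for u in utilities) / n) ** (1.0 / p)


def greedy_allocate(items, p, eps=1e-9):
    """Myopic integral allocation (illustrative sketch).

    items: list of per-item value vectors, items[t][i] = value of item t to agent i.
    eps:   hypothetical tiny starting utility, an assumption of this sketch.
    Returns (choices, welfare), where welfare is the p-mean of time-averaged utilities.
    """
    n = len(items[0])
    totals = [eps] * n
    choices = []
    for values in items:
        # Dividing by T rescales every candidate equally (the p-mean is
        # homogeneous of degree 1), so comparing cumulative totals suffices.
        def welfare_if_given_to(i):
            trial = list(totals)
            trial[i] += values[i]
            return p_mean(trial, p)

        best = max(range(n), key=welfare_if_given_to)
        totals[best] += values[best]
        choices.append(best)
    horizon = len(items)
    return choices, p_mean([u / horizon for u in totals], p)


if __name__ == "__main__":
    demo_items = [[1.0, 0.0], [0.0, 1.0]]
    print(greedy_allocate(demo_items, p=0.5))
```

In the two-item demo, each agent values exactly one item, so the greedy rule gives each item to the agent who values it. The re-solving algorithms mentioned in the abstract would instead solve a welfare-maximization problem over the historical samples in every period; that step is not sketched here.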
