LG OC · Sep 25, 2024

Risk-averse learning with delayed feedback

Siyi Wang, Zifan Wang, Karl Henrik Johansson, Sandra Hirche

arXiv:2409.16866v2 · 2.6 · h-index: 4

Originality: Incremental advance

AI Analysis

This work addresses risk management in learning systems with delayed feedback. The advance is incremental: it extends existing zeroth-order methods to handle delays.

The paper tackles risk-averse learning with delayed feedback by developing two zeroth-order optimization algorithms using Conditional Value at Risk (CVaR), analyzing their dynamic regrets in terms of cumulative delay and total samplings, and showing that the two-point algorithm achieves a smaller regret bound than the one-point one.
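For reference, one common way to write CVaR is the variational form below. The notation is an assumption made here for illustration and may differ from the paper's: $J$ is a random cost, $\alpha \in (0, 1]$ is the risk level (the fraction of worst outcomes considered), and $(z)_+ = \max\{z, 0\}$.

```latex
\mathrm{CVaR}_{\alpha}[J] \;=\; \min_{t \in \mathbb{R}} \left\{ t + \frac{1}{\alpha}\, \mathbb{E}\big[(J - t)_{+}\big] \right\}
```

For a continuous cost distribution this equals the expected cost over the worst $\alpha$-fraction of outcomes, and at $\alpha = 1$ it reduces to the ordinary expectation $\mathbb{E}[J]$.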

In real-world scenarios, risk-averse learning is valuable for mitigating potential adverse outcomes. However, delayed feedback makes it challenging to assess and manage risk effectively. In this paper, we investigate risk-averse learning using Conditional Value at Risk (CVaR) as the risk measure, while incorporating feedback with random but bounded delays. We develop two risk-averse learning algorithms that rely on one-point and two-point zeroth-order optimization approaches, respectively. The dynamic regrets of the algorithms are analyzed in terms of the cumulative delay and the total number of samples. In the absence of delay, the regret bounds match the established bounds of zeroth-order stochastic gradient methods for risk-averse learning. Furthermore, the two-point risk-averse learning algorithm outperforms the one-point algorithm by achieving a smaller regret bound. We provide numerical experiments on a dynamic pricing problem to demonstrate the performance of the algorithms.
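The ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the toy quadratic cost, the noise model, the step size `eta`, the smoothing radius `delta`, the sample count, and the rule that a delayed gradient is applied in the round it arrives are all assumptions chosen for illustration. The sketch estimates CVaR empirically as the mean of the worst `alpha`-fraction of sampled costs, forms one-point and two-point zeroth-order gradient estimates from those CVaR values, and applies each estimate after a random bounded delay.

```python
import numpy as np

def empirical_cvar(losses, alpha):
    """Mean of the worst alpha-fraction of sampled losses, alpha in (0, 1]."""
    losses = np.sort(np.asarray(losses, dtype=float))[::-1]  # largest first
    k = max(1, int(np.ceil(alpha * losses.size)))
    return float(losses[:k].mean())

def one_point_gradient(cvar_at, x, u, delta):
    """One-point estimate: a single CVaR query at x + delta * u."""
    return (x.size / delta) * cvar_at(x + delta * u) * u

def two_point_gradient(cvar_at, x, u, delta):
    """Two-point estimate: symmetric difference of two CVaR queries."""
    diff = cvar_at(x + delta * u) - cvar_at(x - delta * u)
    return (x.size / (2.0 * delta)) * diff * u

def run_delayed(T=300, max_delay=3, eta=0.05, delta=0.1,
                alpha=0.1, n_samples=100, seed=0):
    """Toy loop (hypothetical setup): gradients arrive after a bounded random delay."""
    rng = np.random.default_rng(seed)
    target = np.array([1.0, -1.0])   # assumed minimizer of the toy cost
    x = np.zeros(2)
    pending = {}                     # arrival round -> list of gradient estimates

    def cvar_at(z):
        # Assumed cost model: quadratic in z plus Gaussian noise.
        costs = np.sum((z - target) ** 2) + 0.1 * rng.standard_normal(n_samples)
        return empirical_cvar(costs, alpha)

    for t in range(T):
        u = rng.standard_normal(x.size)
        u /= np.linalg.norm(u)       # uniform direction on the unit sphere
        g = two_point_gradient(cvar_at, x, u, delta)
        arrival = t + int(rng.integers(0, max_delay + 1))
        pending.setdefault(arrival, []).append(g)
        # Apply every estimate whose feedback arrives in this round.
        for g_late in pending.pop(t, []):
            x = x - eta * g_late
    return x, target
```

Replacing `two_point_gradient` with `one_point_gradient` inside `run_delayed` gives the one-point variant. Its estimate scales with the CVaR value itself rather than with a difference of two values, so it is typically much noisier, which is consistent with the abstract's statement that the two-point algorithm achieves a smaller regret bound.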
