Proximal Online Gradient is Optimum for Dynamic Regret
The result provides a foundation for applications such as online recommendation, where user preferences evolve over time, although the contribution is incremental because it builds on existing methods.
The paper tackles the problem of identifying the optimal algorithm for dynamic regret in online learning, where the reference solution changes over time, and shows that proximal online gradient is optimal by proving a lower bound that matches an improved upper bound.
In online learning, the dynamic regret metric compares the learner against a reference (optimal) solution that may change over time, whereas the typical (static) regret metric assumes the reference solution is constant over the whole time horizon. The dynamic regret metric is particularly interesting for applications such as online recommendation, since customers' preferences evolve over time. While the online gradient method has been shown to be optimal for the static regret metric, the optimal algorithm for the dynamic regret metric remains unknown. In this paper, we show that proximal online gradient (a general version of online gradient) is optimal for dynamic regret by proving a lower bound that matches an upper bound, which slightly improves the existing upper bound.
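To make the two regret notions concrete, the following is a minimal sketch of a proximal online gradient update together with both regret computations. Everything specific here is an assumption made for illustration and is not taken from the paper: the loss f_t(x) = 0.5 * ||x - a_t||^2 + lam * ||x||_1, the l1 regularizer (whose proximal operator is soft-thresholding), the fixed step size eta, the slowly drifting synthetic targets a_t, and the choice of the per-round minimizers as the dynamic reference sequence. The function names are hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||x||_1 (elementwise shrinkage toward zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def loss(x, a, lam):
    """Assumed loss f_t(x) = 0.5 * ||x - a_t||^2 + lam * ||x||_1, one value per round."""
    return 0.5 * np.sum((x - a) ** 2, axis=-1) + lam * np.sum(np.abs(x), axis=-1)

def proximal_online_gradient(targets, eta=0.5, lam=0.01):
    """Return the iterates x_1..x_T, each played before its loss is revealed."""
    T, d = targets.shape
    x = np.zeros(d)
    iterates = np.empty((T, d))
    for t in range(T):
        iterates[t] = x                      # play x_t
        grad = x - targets[t]                # gradient of the smooth part at x_t
        # gradient step on the smooth part, then prox of the l1 part
        x = soft_threshold(x - eta * grad, eta * lam)
    return iterates

def regrets(targets, iterates, lam):
    """Return (dynamic regret, static regret) for the assumed losses."""
    incurred = loss(iterates, targets, lam).sum()
    # Dynamic reference: the minimizer of each f_t, which changes every round.
    dynamic_ref = soft_threshold(targets, lam)
    dynamic = incurred - loss(dynamic_ref, targets, lam).sum()
    # Static reference: the single best point in hindsight for sum_t f_t.
    static_ref = soft_threshold(targets.mean(axis=0), lam)
    static = incurred - loss(static_ref, targets, lam).sum()
    return dynamic, static

# Synthetic drifting targets (a random walk), standing in for evolving preferences.
rng = np.random.default_rng(0)
targets = np.cumsum(0.05 * rng.standard_normal((200, 3)), axis=0)
iterates = proximal_online_gradient(targets, eta=0.5, lam=0.01)
dynamic, static = regrets(targets, iterates, lam=0.01)
print(f"dynamic regret: {dynamic:.4f}, static regret: {static:.4f}")
```

With per-round minimizers as the reference sequence, the dynamic regret is never smaller than the static regret, because the sum of per-round minima cannot exceed the minimum of the summed losses. The step size and drift scale above are arbitrary and would need tuning in any real setting.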