Online Convex Optimization Against Adversaries with Memory and Application to Statistical Arbitrage
This work addresses temporal constraints in online learning, with finance as a motivating application, but the contribution is incremental: it extends an existing memory-based framework from the experts setting to general OCO.
The authors extend online convex optimization to adversaries with memory and present two algorithms. The first requires Lipschitz continuous losses and achieves optimal regret bounds for both convex and strongly convex losses; the second achieves the optimal regret bounds for convex losses without requiring Lipschitz continuity, but is more complicated to implement. They apply these results to statistical arbitrage in finance, devising algorithms for constructing mean-reverting portfolios.
The framework of online learning with memory naturally captures learning problems with temporal constraints, and was previously studied for the experts setting. In this work we extend the notion of learning with memory to the general Online Convex Optimization (OCO) framework, and present two algorithms that attain low regret. The first algorithm applies to Lipschitz continuous loss functions, obtaining optimal regret bounds for both convex and strongly convex losses. The second algorithm attains the optimal regret bounds and applies more broadly to convex losses without requiring Lipschitz continuity, yet is more complicated to implement. We complement our theoretical results with an application to statistical arbitrage in finance: we devise algorithms for constructing mean-reverting portfolios.
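To make the memory setting concrete, the following is a minimal sketch rather than either of the paper's algorithms. It assumes a loss at round t that depends on the learner's last m+1 decisions, and it runs a generic projected online gradient descent step on the unary loss x -> f_t(x, ..., x). Regret is measured against a fixed comparator played in every slot of the window. The function names, the toy quadratic loss, the Euclidean-ball decision set, and the step size are all illustrative assumptions.

```python
import numpy as np


def project_to_ball(x, radius):
    """Euclidean projection onto the ball of the given radius (assumed decision set)."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)


def ogd_on_unary_loss(grad_unary, x0, T, eta, radius=1.0):
    """Projected gradient steps on the unary loss x -> f_t(x, ..., x).

    grad_unary(t, x) returns the gradient of the unary loss at round t.
    A small step size keeps consecutive decisions close together, which
    is what makes the unary loss a reasonable proxy for the loss with memory.
    """
    x = project_to_ball(np.asarray(x0, dtype=float), radius)
    iterates = [x.copy()]
    for t in range(T - 1):
        x = project_to_ball(x - eta * grad_unary(t, x), radius)
        iterates.append(x.copy())
    return iterates


def policy_regret(loss, iterates, m, comparator):
    """Learner's loss on its own decision windows minus the loss of a fixed point."""
    T = len(iterates)
    learner = sum(loss(t, iterates[t - m:t + 1]) for t in range(m, T))
    fixed = sum(loss(t, [comparator] * (m + 1)) for t in range(m, T))
    return learner - fixed


# Toy instance (hypothetical): the loss penalizes the distance between the
# average of the last m+1 decisions and a fixed target inside the unit ball.
m, T = 2, 200
target = np.array([0.3, -0.4])


def loss(t, window):
    return float(np.sum((np.mean(window, axis=0) - target) ** 2))


def grad_unary(t, x):
    # With all window entries equal to x, the loss reduces to ||x - target||^2.
    return 2.0 * (x - target)


iterates = ogd_on_unary_loss(grad_unary, np.zeros(2), T, eta=0.05)
regret = policy_regret(loss, iterates, m, comparator=target)
```

In this toy instance the target is itself the best fixed decision, so `regret` is simply the learner's cumulative loss with memory; it stays bounded as T grows because the iterates converge to the target.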