Online Self-Concordant and Relatively Smooth Minimization, With Applications to Online Portfolio Selection and Learning Quantum States
This work provides incremental improvements in regret bounds for online portfolio selection and quantum state learning, benefiting researchers in optimization and quantum machine learning.
The paper tackles online convex optimization with self-concordant and relatively smooth loss functions, deriving improved regret bounds for online mirror descent. It applies this to online portfolio selection, achieving a regret of $ ilde{O}(T^{2/3}d^{1/3})$ for a variant of exponentiated gradient and $ ilde{O}(\sqrt{Td})$ with a logarithmic barrier, and to online learning of quantum states, obtaining $ ilde{O}(\sqrt{Td})$ regret with faster per-iteration time.
Consider an online convex optimization problem where the loss functions are self-concordant barriers, smooth relative to a convex function $h$, and possibly non-Lipschitz. We analyze the regret of online mirror descent with $h$. Then, based on the result, we prove the following in a unified manner. Denote by $T$ the time horizon and $d$ the parameter dimension. 1. For online portfolio selection, the regret of $\widetilde{\text{EG}}$, a variant of exponentiated gradient due to Helmbold et al., is $\tilde{O} ( T^{2/3} d^{1/3} )$ when $T > 4 d / \log d$. This improves on the original $\tilde{O} ( T^{3/4} d^{1/2} )$ regret bound for $\widetilde{\text{EG}}$. 2. For online portfolio selection, the regret of online mirror descent with the logarithmic barrier is $\tilde{O}(\sqrt{T d})$. The regret bound is the same as that of Soft-Bayes due to Orseau et al. up to logarithmic terms. 3. For online learning quantum states with the logarithmic loss, the regret of online mirror descent with the log-determinant function is also $\tilde{O} ( \sqrt{T d} )$. Its per-iteration time is shorter than all existing algorithms we know.