Optimal amortized regret in every interval
This addresses the challenge of providing strong performance guarantees in all intervals for online learning, which is incremental over prior work that offered weaker bounds like O(√(x log T)).
The paper tackles the problem of achieving optimal amortized regret in every interval for the next-bit prediction task, presenting a randomized algorithm that achieves O(√x) regret in any interval of length x, with an empirically estimated constant of around 2.1 for T up to 2000.
Consider the classical problem of predicting the next bit in a sequence of bits. A standard performance measure is {\em regret} (loss in payoff) with respect to a set of experts. For example if we measure performance with respect to two constant experts one that always predicts 0's and another that always predicts 1's it is well known that one can get regret $O(\sqrt T)$ with respect to the best expert by using, say, the weighted majority algorithm. But this algorithm does not provide performance guarantee in any interval. There are other algorithms that ensure regret $O(\sqrt {x \log T})$ in any interval of length $x$. In this paper we show a randomized algorithm that in an amortized sense gets a regret of $O(\sqrt x)$ for any interval when the sequence is partitioned into intervals arbitrarily. We empirically estimated the constant in the $O()$ for $T$ upto 2000 and found it to be small -- around 2.1. We also experimentally evaluate the efficacy of this algorithm in predicting high frequency stock data.