Adaptive Online Non-stochastic Control
This work addresses control in non-stochastic environments, with applications such as robotics and autonomous systems. It offers adaptive performance improvements, although the way it integrates existing methods is incremental.
The paper tackles Non-stochastic Control by developing algorithms whose policy regret scales with the difficulty of the environment, achieving sub-linear, data-adaptive regret bounds that tighten when the cost gradients are small.
We tackle the problem of Non-stochastic Control (NSC) with the aim of obtaining algorithms whose policy regret is proportional to the difficulty of the controlled environment. Namely, we tailor the Follow The Regularized Leader (FTRL) framework to dynamical systems by using regularizers that are proportional to the costs actually observed. The main challenge arises from using the proposed adaptive regularizers in the presence of a state, or equivalently a memory, which couples the effects of the online decisions and requires new tools for bounding the regret. Via new analysis techniques for integrating NSC and FTRL, we obtain novel disturbance action controllers (DAC) with sub-linear, data-adaptive policy regret bounds that shrink when the trajectory of costs has small gradients, while staying sub-linear even in the worst case.
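To make the two ingredients named above concrete, here is a minimal Python sketch of a disturbance action controller whose parameters are updated by FTRL with a gradient-adaptive quadratic regularizer. It is an illustration under stated assumptions, not the paper's algorithm: the regularizer weight (the square root of the accumulated squared gradient norms, as in AdaGrad-norm style methods), the scalar system, the stabilizing gain `K`, and the gradient taken only through the direct control cost are all hypothetical choices. In particular, the simplified gradient ignores the state (memory) coupling that the abstract identifies as the main difficulty.

```python
import numpy as np

def dac_action(K, M, x, past_w):
    """DAC policy: u_t = -K x_t + sum_i M[i] @ w_{t-1-i} (past_w is most recent first)."""
    u = -K @ x
    for M_i, w in zip(M, past_w):
        u = u + M_i @ w
    return u

class AdaptiveFTRL:
    """Linearized FTRL over a Frobenius-norm ball.

    Assumed regularizer: (sigma_t / 2) * ||M||^2 with
    sigma_t = sqrt(sum_s ||g_s||^2) / radius, so small observed
    gradients mean weak regularization.
    """
    def __init__(self, shape, radius=1.0, eps=1e-8):
        self.grad_sum = np.zeros(shape)
        self.sq_norm_sum = 0.0
        self.radius = radius
        self.eps = eps

    def update(self, grad):
        self.grad_sum += grad
        self.sq_norm_sum += float(np.sum(grad ** 2))
        sigma = np.sqrt(self.sq_norm_sum) / self.radius + self.eps
        # Unconstrained minimizer of <G, M> + (sigma/2)||M||^2 is -G / sigma;
        # projecting it onto the ball gives the constrained minimizer.
        M = -self.grad_sum / sigma
        norm = np.linalg.norm(M)
        if norm > self.radius:
            M *= self.radius / norm
        return M

def simulate(T=200, H=3, seed=0):
    """Hypothetical scalar system x_{t+1} = A x_t + B u_t + w_t with bounded noise."""
    rng = np.random.default_rng(seed)
    A, B, K = np.array([[0.9]]), np.array([[1.0]]), np.array([[0.5]])
    M = np.zeros((H, 1, 1))
    learner = AdaptiveFTRL(M.shape, radius=1.0)
    x = np.zeros(1)
    past_w = [np.zeros(1) for _ in range(H)]
    total_cost = 0.0
    for _ in range(T):
        u = dac_action(K, M, x, past_w)
        total_cost += float(x @ x + u @ u)
        # Simplification: gradient of ||u_t||^2 with respect to M only,
        # ignoring how earlier choices of M shaped the current state.
        grad = np.stack([2.0 * np.outer(u, w) for w in past_w])
        M = learner.update(grad)
        w = 0.1 * rng.uniform(-1.0, 1.0, size=1)
        x = A @ x + B @ u + w
        past_w = [w] + past_w[:-1]
    return total_cost, M
```

Calling `simulate()` returns the accumulated cost and the final DAC parameters. The point of the sketch is only the shape of the update: when the observed gradients are small, `sigma` grows slowly, which is the mechanism behind regret bounds that adapt to the data.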