OCAISYMar 28, 2023

Worst-Case Control and Learning Using Partial Observations Over an Infinite Time-Horizon

arXiv:2303.16321v27 citationsh-index: 9
Originality Incremental advance
AI Analysis

This work addresses safety-critical cyber-physical systems by providing robust control strategies against uncertainties, though it appears incremental as it builds on existing dynamic programming methods with specific enhancements.

The paper tackles the problem of worst-case control and learning in partially observed systems with adversarial disturbances, presenting a framework to minimize discounted cost over an infinite horizon. It introduces information states to improve computational tractability without losing optimality and defines an approximate information state for observable costs, with performance bounds and a numerical example.

Safety-critical cyber-physical systems require control strategies whose worst-case performance is robust against adversarial disturbances and modeling uncertainties. In this paper, we present a framework for approximate control and learning in partially observed systems to minimize the worst-case discounted cost over an infinite time horizon. We model disturbances to the system as finite-valued uncertain variables with unknown probability distributions. For problems with known system dynamics, we construct a dynamic programming (DP) decomposition to compute the optimal control strategy. Our first contribution is to define information states that improve the computational tractability of this DP without loss of optimality. Then, we describe a simplification for a class of problems where the incurred cost is observable at each time instance. Our second contribution is defining an approximate information state that can be constructed or learned directly from observed data for problems with observable costs. We derive bounds on the performance loss of the resulting approximate control strategy and illustrate the effectiveness of our approach in partially observed decision-making problems with a numerical example.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes