AI | Mar 22, 2024

Contextual Restless Multi-Armed Bandits with Application to Demand Response Decision-Making

arXiv:2403.15640v1 | 9.69 citations | h-index: 1 | CDC

Originality: Incremental advance

AI Analysis

This work addresses decision-making challenges in dynamic environments such as smart grids. Its framework is an incremental advance: it integrates two existing bandit types, contextual and restless bandits, rather than introducing a new one.

The paper tackles complex online decision-making by introducing a Contextual Restless Bandits (CRB) framework, which combines contextual and restless bandits to model both the internal state transitions of each arm and the external context. The result is a scalable index policy algorithm with a theoretical asymptotic optimality analysis, evaluated in numerical simulations of smart grid demand response.

This paper introduces a novel multi-armed bandit framework, termed Contextual Restless Bandits (CRB), for complex online decision-making. The CRB framework incorporates the core features of contextual bandits and restless bandits, so that it can model both the internal state transitions of each arm and the influence of external global environmental contexts. Using the dual decomposition method, we develop a scalable index policy algorithm for solving the CRB problem and theoretically analyze the asymptotic optimality of this algorithm. For the case in which the arm models are unknown, we further propose a model-based online learning algorithm, built on the index policy, that learns the arm models and makes decisions simultaneously. Furthermore, we apply the proposed CRB framework and the index policy algorithm to the demand response decision-making problem in smart grids. Numerical simulations demonstrate the performance and efficiency of the proposed CRB approaches.
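As a rough illustration of the index-policy idea in the abstract, the Python sketch below decouples the arms with a fixed activation price and ranks them by the resulting advantage of activating. It is a generic Lagrangian-style index sketch, not the paper's algorithm. It assumes a discounted reward criterion, a global context that follows an exogenous Markov chain, and a multiplier `lam` that is simply given, whereas a dual decomposition method would tune it. All names (`arm_q_values`, `select_arms`, `P`, `R`, `C`, `lam`) are hypothetical.

```python
import numpy as np


def arm_q_values(P, R, C, lam, gamma=0.95, iters=500):
    """Value iteration for a single arm under an activation price `lam`.

    Assumed (hypothetical) array layout:
      P[c, a, s, t]: probability the arm moves from state s to t under
                     context c and action a (0 = passive, 1 = active)
      R[c, a, s]:    expected reward in state s under context c, action a
      C[c, d]:       probability the global context moves from c to d
    Returns Q[c, s, a] for the price-adjusted single-arm problem.
    """
    n_ctx, n_act, n_state, _ = P.shape
    V = np.zeros((n_ctx, n_state))
    price = lam * np.arange(n_act)[None, :, None]  # only action 1 pays lam
    for _ in range(iters):
        EV = C @ V  # expected next value, averaged over the next context
        Q = R - price + gamma * np.einsum("cast,ct->cas", P, EV)
        V = Q.max(axis=1)
    return np.transpose(Q, (0, 2, 1))


def select_arms(Q_list, states, context, budget):
    """Activate the `budget` arms with the largest index Q(active) - Q(passive)."""
    index = np.array(
        [Q[context, s, 1] - Q[context, s, 0] for Q, s in zip(Q_list, states)]
    )
    chosen = np.argsort(-index)[:budget]
    return chosen, index
```

Each arm is solved independently, so the cost grows linearly in the number of arms rather than exponentially in the joint state space. That is the usual motivation for index policies in restless bandits.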
