SY, LG | Mar 7, 2020

Online Residential Demand Response via Contextual Multi-Armed Bandits

arXiv:2003.03627v2 | 5.933 citations

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of uncertain customer behaviors in residential demand response for electricity systems. The advance is incremental: it builds on existing learning techniques by incorporating contextual factors.

The paper tackles the problem of selecting optimal residential customers for demand response to maximize load reduction under a budget, by formulating it as a contextual multi-armed bandit problem and proposing an online learning algorithm based on Thompson sampling, with numerical simulations demonstrating its effectiveness.

Residential loads have great potential to enhance the efficiency and reliability of electricity systems via demand response (DR) programs. One major challenge in residential DR is to handle the unknown and uncertain customer behaviors. Previous works use learning techniques to predict customer DR behaviors, while the influence of time-varying environmental factors is generally neglected, which may lead to inaccurate prediction and inefficient load adjustment. In this paper, we consider the residential DR problem where the load service entity (LSE) aims to select an optimal subset of customers to maximize the expected load reduction with a financial budget. To learn the uncertain customer behaviors under the environmental influence, we formulate the residential DR as a contextual multi-armed bandit (MAB) problem, and the online learning and selection (OLS) algorithm based on Thompson sampling is proposed to solve it. This algorithm takes the contextual information into consideration and is applicable to complicated DR settings. Numerical simulations are performed to demonstrate the learning effectiveness of the proposed algorithm.
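The abstract's idea of selecting customers with Thompson sampling under a budget can be illustrated with a minimal sketch. This is not the paper's OLS algorithm: the linear-Gaussian response model, the greedy reward-per-cost selection rule, and all names (`LinearThompsonSelector`, `reductions`, `costs`, `noise_var`) are assumptions made for illustration only.

```python
import numpy as np


class LinearThompsonSelector:
    """Hypothetical sketch: one Bayesian linear model per customer.

    Assumption: a customer's response probability is roughly linear in the
    context vector, with Gaussian noise. The paper may use a different model.
    """

    def __init__(self, n_customers, n_features, prior_var=1.0, noise_var=0.25, seed=0):
        self.rng = np.random.default_rng(seed)
        self.noise_var = noise_var
        # Posterior precision matrix and precision-weighted mean per customer.
        self.precision = np.stack(
            [np.eye(n_features) / prior_var for _ in range(n_customers)]
        )
        self.b = np.zeros((n_customers, n_features))

    def select(self, context, reductions, costs, budget):
        """Sample one parameter vector per customer, then pick greedily by
        sampled reduction per unit cost until the budget is used up."""
        n = len(costs)
        scores = np.empty(n)
        for i in range(n):
            cov = np.linalg.inv(self.precision[i])
            cov = (cov + cov.T) / 2  # remove numerical asymmetry
            mean = cov @ self.b[i]
            theta = self.rng.multivariate_normal(mean, cov)
            prob = np.clip(theta @ context, 0.0, 1.0)
            scores[i] = reductions[i] * prob  # sampled expected load reduction
        chosen, spent = [], 0.0
        for i in np.argsort(-scores / costs):
            if scores[i] > 0 and spent + costs[i] <= budget:
                chosen.append(int(i))
                spent += costs[i]
        return chosen

    def update(self, customer, context, responded):
        """Bayesian linear regression update after observing a response (0 or 1)."""
        self.precision[customer] += np.outer(context, context) / self.noise_var
        self.b[customer] += context * responded / self.noise_var
```

In each DR event, the caller would observe the context (for example, temperature and time of day), call `select`, dispatch signals to the chosen customers, and then call `update` once per selected customer with the observed response. The greedy ratio rule is a common knapsack heuristic and is not guaranteed to be optimal.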
