SY, LG | Nov 27, 2022

Combined Peak Reduction and Self-Consumption Using Proximal Policy Optimization

Thijs Peirelinck, Chris Hermans, Fred Spiessens, Geert Deconinck

arXiv:2211.14831v1 | 2.36 citations | h-index: 21

Originality: Synthesis-oriented

AI Analysis

This work addresses data efficiency in demand response for households, but it is incremental as it builds on existing transfer learning and PPO methods.

The paper tackled the challenge of data efficiency in reinforcement learning for residential demand response by incorporating domain knowledge into the learning pipeline, resulting in a 14.51% cost reduction compared to a hysteresis controller and 6.68% compared to traditional PPO.

Residential demand response programs aim to activate demand flexibility at the household level. In recent years, reinforcement learning (RL) has gained significant attention for this type of application. A major challenge of RL algorithms is data efficiency. New RL algorithms, such as proximal policy optimisation (PPO), have tried to increase data efficiency. Additionally, combining RL with transfer learning has been proposed in an effort to mitigate this challenge. In this work, we further improve upon state-of-the-art transfer learning performance by incorporating demand response domain knowledge into the learning pipeline. We evaluate our approach on a demand response use case where peak shaving and self-consumption are incentivised by means of a capacity tariff. We show that our adapted version of PPO, combined with transfer learning, reduces cost by 14.51% compared to a regular hysteresis controller and by 6.68% compared to traditional PPO.
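To make the incentive concrete, the sketch below shows how a capacity tariff can reward both peak shaving and self-consumption. The abstract does not give the tariff's actual form, so the bill structure here (an energy charge on grid offtake plus a charge on the highest offtake in the period), the function name `capacity_tariff_cost`, and both prices are illustrative assumptions and not the authors' setup.

```python
from typing import Sequence


def capacity_tariff_cost(
    net_power_kw: Sequence[float],
    step_hours: float = 1.0,
    energy_price: float = 0.25,   # assumed price per kWh of grid offtake
    capacity_price: float = 4.0,  # assumed price per kW of peak offtake
) -> float:
    """Hypothetical bill: energy charge on offtake plus a charge on the peak.

    net_power_kw holds load minus local generation per time step;
    negative values (injection) are neither billed nor paid in this sketch.
    """
    offtake = [max(p, 0.0) for p in net_power_kw]
    energy_cost = energy_price * sum(p * step_hours for p in offtake)
    capacity_cost = capacity_price * max(offtake, default=0.0)
    return energy_cost + capacity_cost


# Same 8 kWh of demand, different shape: the flat profile avoids the peak charge.
peaky = [0.0, 8.0, 0.0, 0.0]
flat = [2.0, 2.0, 2.0, 2.0]

# Self-consumption: running a 3 kW load while 3 kW of PV is available
# leaves zero net offtake in that step.
load = [0.0, 0.0, 3.0, 0.0]
pv = [0.0, 0.0, 3.0, 0.0]
self_consumed = [l - g for l, g in zip(load, pv)]

cost_peaky = capacity_tariff_cost(peaky)
cost_flat = capacity_tariff_cost(flat)
cost_self = capacity_tariff_cost(self_consumed)
```

Under these assumed prices the flat profile costs less than the peaky one although both draw the same energy, and the self-consumed load adds nothing to the bill. A cost of this shape could serve as a (negated) reward signal for an RL agent, though whether the paper does so is not stated in the abstract.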
