Offline-Online Reinforcement Learning for Energy Pricing in Office Demand Response: Lowering Energy and Data Costs
This work addresses cost reduction for energy management in office buildings, but it is incremental as it applies existing offline RL techniques to a specific domain.
The paper tackles the high data and implementation costs of training reinforcement learning agents for energy pricing in office demand response by leveraging offline training methods, which accelerate convergence and reduce costs.
Our team is proposing to run a full-scale energy demand response experiment in an office building. Although this is an exciting endeavor that will provide value to the community, collecting training data for the reinforcement learning agent is costly, and the data will be limited. In this work, we examine how offline training can be leveraged to minimize data costs (by accelerating convergence) and program implementation costs. We present two approaches to doing so: pretraining our model on simulated tasks to warm-start the experiment, and using a planning model trained to simulate the real world's rewards for the agent. We present results that demonstrate the utility of offline reinforcement learning for efficient price-setting in the energy demand response problem.
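The two approaches can be illustrated with a toy sketch. This is not the paper's agent, simulator, or planning model: it assumes a simple epsilon-greedy bandit over five normalized price levels, a quadratic reward that peaks at a "best" price, and a planning model that is just the mean logged reward per price. Every name here (`PRICES`, `make_env`, `train`, `fit_reward_model`, `run_demo`) is hypothetical and chosen only for illustration.

```python
import random

PRICES = [0.0, 0.25, 0.5, 0.75, 1.0]  # hypothetical normalized price levels


def make_env(best_price, seed, noise=0.01):
    """Toy reward source: reward peaks when the chosen price equals best_price."""
    rng = random.Random(seed)

    def reward(action):
        return -(PRICES[action] - best_price) ** 2 + rng.gauss(0.0, noise)

    return reward


def greedy(q):
    """Index of the price with the highest estimated reward."""
    return max(range(len(q)), key=q.__getitem__)


def train(q, reward_fn, steps, seed, epsilon=0.2, lr=0.1):
    """Epsilon-greedy updates of per-price reward estimates; returns a new list."""
    rng = random.Random(seed)
    q = list(q)
    for _ in range(steps):
        action = rng.randrange(len(q)) if rng.random() < epsilon else greedy(q)
        q[action] += lr * (reward_fn(action) - q[action])
    return q


def fit_reward_model(logged):
    """Planning model (assumed form): mean logged reward per price index."""
    totals = [0.0] * len(PRICES)
    counts = [0] * len(PRICES)
    for action, r in logged:
        totals[action] += r
        counts[action] += 1
    means = [t / c for t, c in zip(totals, counts)]  # assumes every price was logged
    return lambda action: means[action]


def run_demo():
    simulator = make_env(best_price=0.5, seed=0)  # deliberately imperfect simulator
    real = make_env(best_price=0.75, seed=1)      # stand-in for the real office
    zeros = [0.0] * len(PRICES)

    # Approach 1: pretrain on simulated tasks, then fine-tune on a few real steps.
    pretrained = train(zeros, simulator, steps=500, seed=2)
    warm = train(pretrained, real, steps=200, seed=3)

    # Approach 2: fit a reward model on a small real log, then train against it
    # so that further training consumes no additional real-world interactions.
    logged = [(a, real(a)) for a in range(len(PRICES)) for _ in range(5)]
    planned = train(zeros, fit_reward_model(logged), steps=300, seed=4)

    return PRICES[greedy(pretrained)], PRICES[greedy(warm)], PRICES[greedy(planned)]


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the simulator is intentionally mismatched with the "real" environment, so the warm-started estimates must be corrected during fine-tuning, while the planning model spends real interactions only on the 25 logged samples it is fitted to.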