A Learning-based Optimal Market Bidding Strategy for Price-Maker Energy Storage
This work addresses the challenge that load serving entities face in optimizing bidding strategies safely and efficiently, though the contribution is incremental because it builds on existing reinforcement learning and model-based control techniques.
The paper tackles the problem of energy storage bidding in electricity markets by developing a supervised Actor-Critic algorithm that learns to adjust bids to account for their impact on clearing prices, resulting in higher profits than the model-based controller used as its supervisor.
Load serving entities with storage units are reaching sizes and performance levels that can significantly impact clearing prices in electricity markets. Nevertheless, price endogeneity is rarely considered in storage bidding strategies, and modeling the electricity market is a challenging task. Meanwhile, model-free reinforcement learning algorithms such as Actor-Critic are becoming increasingly popular for designing energy system controllers. Yet their implementation frequently requires lengthy, data-intensive, and unsafe trial-and-error training. To fill these gaps, we implement an online Supervised Actor-Critic (SAC) algorithm, supervised by a model-based controller, Model Predictive Control (MPC). The energy storage agent is trained with this algorithm to bid optimally while learning and adjusting to its impact on the market clearing prices. We compare the SAC algorithm with the MPC algorithm that serves as its supervisor and find that the former reaps higher profits via learning. Our contribution is thus an online and safe SAC algorithm that outperforms the current model-based state of the art.
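To make the notions of price endogeneity and supervision concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a toy linear price-impact model, a simple price-taker rule standing in for the MPC supervisor, and a convex blend of supervisor and actor bids with a weight k. The function names, the blending rule, and all numerical values are hypothetical and chosen only for illustration.

```python
def clearing_price(base_price, injection, impact):
    """Toy linear price impact (assumption): selling more lowers the price."""
    return base_price - impact * injection


def profit(base_price, injection, impact):
    """Revenue from selling `injection` MWh at the price that bid induces."""
    return clearing_price(base_price, injection, impact) * injection


def composite_action(actor_action, supervisor_action, k):
    """Blend the two bids; k in [0, 1] is the weight on the supervisor (assumption)."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie in [0, 1]")
    return k * supervisor_action + (1.0 - k) * actor_action


# Hypothetical numbers for a single market interval.
BASE_PRICE = 50.0  # $/MWh when the storage unit does not bid
IMPACT = 2.5       # $/MWh price drop per MWh sold by the storage unit
CAPACITY = 20.0    # MWh that can be discharged in this interval

supervisor_bid = CAPACITY  # price-taker stand-in: sell everything, ignore impact
actor_bid = 10.0           # a bid an actor might learn once it sees its price impact

# Shift weight from the supervisor (k = 1) to the actor (k = 0).
for k in (1.0, 0.25, 0.0):
    bid = composite_action(actor_bid, supervisor_bid, k)
    print(f"k={k}: bid={bid} MWh, profit={profit(BASE_PRICE, bid, IMPACT)} $")
```

In this toy setting the price-taker bid drives the clearing price to zero, while a smaller bid that internalizes the price impact earns a positive profit. Starting with a large supervisor weight keeps early bids close to the model-based controller's, which is one plausible reading of how supervision could make online training safer.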