SY, LG | Oct 1, 2025

Comparative Field Deployment of Reinforcement Learning and Model Predictive Control for Residential HVAC

Ozan Baris Mulayim, Elias N. Pergantis, Levi D. Reyes Premer, Bingqing Chen, Guannan Qu, Kevin J. Kircher, Mario Bergés

arXiv:2510.01475v1 | 3.33 citations | h-index: 6

Originality: Incremental advance

AI Analysis

This work addresses practical scalability challenges for automated HVAC control in residential settings, though it highlights incremental trade-offs rather than breakthrough solutions.

The study compared reinforcement learning (RL) and model predictive control (MPC) for residential HVAC in a real-world deployment, finding that RL achieved 22% energy savings (slightly higher than MPC's 20%) but with modestly higher occupant discomfort, and MPC performed better when normalized for comfort.

Advanced control strategies like Model Predictive Control (MPC) offer significant energy savings for HVAC systems but often require substantial engineering effort, limiting scalability. Reinforcement Learning (RL) promises greater automation and adaptability, yet its practical application in real-world residential settings remains largely undemonstrated, facing challenges related to safety, interpretability, and sample efficiency. To investigate these practical issues, we performed a direct comparison of an MPC and a model-based RL controller, with each controller deployed for a one-month period in an occupied house with a heat pump system in West Lafayette, Indiana. This investigation aimed to explore scalability of the chosen RL and MPC implementations while ensuring safety and comparability. The advanced controllers were evaluated against each other and against the existing controller. RL achieved substantial energy savings (22% relative to the existing controller), slightly exceeding MPC's savings (20%), albeit with modestly higher occupant discomfort. However, when energy savings were normalized for the level of comfort provided, MPC demonstrated superior performance. This study's empirical results show that while RL reduces engineering overhead, it introduces practical trade-offs in model accuracy and operational robustness. The key lessons learned concern the difficulties of safe controller initialization, navigating the mismatch between control actions and their practical implementation, and maintaining the integrity of online learning in a live environment. These insights pinpoint the essential research directions needed to advance RL from a promising concept to a truly scalable HVAC control solution.
