Azita Dabiri

SY
h-index11
8papers
20citations
Novelty50%
AI Score48

8 Papers

SYApr 22, 2025
Distributed model predictive control without terminal cost under inexact distributed optimization

Xiaoyu Liu, Dimos V. Dimarogonas, Changxin Liu et al.

This paper presents a novel distributed model predictive control (MPC) formulation without terminal cost and a corresponding distributed synthesis approach for distributed linear discrete-time systems with coupled constraints. The proposed control scheme introduces an explicit stability condition as an additional constraint based on relaxed dynamic programming. As a result, contrary to other related approaches, system stability with the developed controller does not rely on designing a terminal cost. A distributed synthesis approach is then introduced to handle the stability constraint locally within each local agent. To solve the underlying optimization problem for distributed MPC, a violation-free distributed optimization approach is developed, using constraint tightening to ensure feasibility throughout iterations. A numerical example demonstrates that the proposed distributed MPC approach ensures closed-loop stability for each feasible control sequence, with each agent computing its control input in parallel.

47.8SYMay 6
Model Predictive Control and Moving Horizon Estimation using Statistically Weighted Data-Based Ensemble Models

Laura Boca de Giuli, Samuel Mallick, Alessio La Bella et al.

This paper presents a model predictive control (MPC) framework leveraging an ensemble of data-based models to optimally control complex systems under multiple operating conditions. A novel combination rule for ensemble models is proposed, based on the statistical Mahalanobis distance, enabling the ensemble weights to suitably vary across the prediction window based on the system input. In addition, a novel state observer for ensemble models is developed using moving horizon estimation (MHE). The effectiveness of the proposed methodology is demonstrated on a benchmark energy system operating under multiple conditions.

38.6SYMay 6
Second-Order MPC-Based Distributed Q-Learning

Samuel Mallick, Filippo Airaldi, Azita Dabiri et al.

The state of the art for model predictive control (MPC)-based distributed Q-learning is limited to first-order gradient updates of the MPC parameterization. In general, using secondorder information can significantly improve the speed of convergence for learning, allowing the use of higher learning rates without introducing instability. This work presents a second-order extension to MPC-based Q-learning with updates distributed across local agents, relying only on locally available information and neighbor-to-neighbor communication. In simulation the approach is demonstrated to significantly outperform first-order distributed Q-learning.

SYNov 15, 2023
Reinforcement Learning with Model Predictive Control for Highway Ramp Metering

Filippo Airaldi, Bart De Schutter, Azita Dabiri

In the backdrop of an increasingly pressing need for effective urban and highway transportation systems, this work explores the synergy between model-based and learning-based strategies to enhance traffic flow management by use of an innovative approach to the problem of ramp metering control that embeds Reinforcement Learning (RL) techniques within the Model Predictive Control (MPC) framework. The control problem is formulated as an RL task by crafting a suitable stage cost function that is representative of the traffic conditions, variability in the control action, and violations of the constraint on the maximum number of vehicles in queue. An MPC-based RL approach, which leverages the MPC optimal problem as a function approximation for the RL algorithm, is proposed to learn to efficiently control an on-ramp and satisfy its constraints despite uncertainties in the system model and variable demands. Simulations are performed on a benchmark small-scale highway network to compare the proposed methodology against other state-of-the-art control approaches. Results show that, starting from an MPC controller that has an imprecise model and is poorly tuned, the proposed methodology is able to effectively learn to improve the control policy such that congestion in the network is reduced and constraints are satisfied, yielding an improved performance that is superior to the other controllers.

SYSep 17, 2024
Integrating Reinforcement Learning and Model Predictive Control with Applications to Microgrids

Caio Fabio Oliveira da Silva, Azita Dabiri, Bart De Schutter

This work proposes an approach that integrates reinforcement learning and model predictive control (MPC) to solve finite-horizon optimal control problems in mixed-logical dynamical systems efficiently. Optimization-based control of such systems with discrete and continuous decision variables entails the online solution of mixed-integer linear programs, which suffer from the curse of dimensionality. Our approach aims to mitigate this issue by decoupling the decision on the discrete variables from the decision on the continuous variables. In the proposed approach, reinforcement learning determines the discrete decision variables and simplifies the online optimization problem of the MPC controller from a mixed-integer linear program to a linear program, significantly reducing the computational time. A fundamental contribution of this work is the definition of the decoupled Q-function, which plays a crucial role in making the learning problem tractable in a combinatorial action space. We motivate the use of recurrent neural networks to approximate the decoupled Q-function and show how they can be employed in a reinforcement learning setting. Simulation experiments on a microgrid system using real-world data demonstrate that the proposed method substantially reduces the online computation time of MPC while maintaining high feasibility and low suboptimality.

28.6SYMar 12
Integrated Online Monitoring and Adaption of Process Model Predictive Controllers

Samuel Mallick, Laura Boca de de Giuli, Alessio La Bella et al.

This paper addresses the design of an event-triggered, data-based, and performance-oriented adaption method for model predictive control (MPC). The performance of such a strategy strongly depends on the accuracy of the prediction model, which may require online adaption to prevent performance degradation under changing operating conditions. Unlike existing methods that continuously update model and control parameters from data, potentially leading to catastrophic forgetting and unnecessary control modifications, we propose a novel approach based on statistical monitoring of closed-loop performance indicators. This framework enables the detection of performance degradation, and, when required, controller adaption is performed via reinforcement learning and identification techniques. The proposed strategy is validated on a high-fidelity simulation of a district heating system benchmark.

LGDec 8, 2025
Adaptive Tuning of Parameterized Traffic Controllers via Multi-Agent Reinforcement Learning

Giray Önür, Azita Dabiri, Bart De Schutter

Effective traffic control is essential for mitigating congestion in transportation networks. Conventional traffic management strategies, including route guidance, ramp metering, and traffic signal control, often rely on state feedback controllers, used for their simplicity and reactivity; however, they lack the adaptability required to cope with complex and time-varying traffic dynamics. This paper proposes a multi-agent reinforcement learning framework in which each agent adaptively tunes the parameters of a state feedback traffic controller, combining the reactivity of state feedback controllers with the adaptability of reinforcement learning. By tuning parameters at a lower frequency rather than directly determining control actions at a high frequency, the reinforcement learning agents achieve improved training efficiency while maintaining adaptability to varying traffic conditions. The multi-agent structure further enhances system robustness, as local controllers can operate independently in the event of partial failures. The proposed framework is evaluated on a simulated multi-class transportation network under varying traffic conditions. Results show that the proposed multi-agent framework outperforms the no control and fixed-parameter state feedback control cases, while performing on par with the single-agent RL-based adaptive state feedback control, with a much better resilience to partial failures.

LGDec 6, 2024
Nonmyopic Global Optimisation via Approximate Dynamic Programming

Filippo Airaldi, Bart De Schutter, Azita Dabiri

Unconstrained global optimisation aims to optimise expensive-to-evaluate black-box functions without gradient information. Bayesian optimisation, one of the most well-known techniques, typically employs Gaussian processes as surrogate models, leveraging their probabilistic nature to balance exploration and exploitation. However, Gaussian processes become computationally prohibitive in high-dimensional spaces. Recent alternatives, based on inverse distance weighting (IDW) and radial basis functions (RBFs), offer competitive, computationally lighter solutions. Despite their efficiency, both traditional global and Bayesian optimisation strategies suffer from the myopic nature of their acquisition functions, which focus solely on immediate improvement neglecting future implications of the sequential decision making process. Nonmyopic acquisition functions devised for the Bayesian setting have shown promise in improving long-term performance. Yet, their use in deterministic strategies with IDW and RBF remains unexplored. In this work, we introduce novel nonmyopic acquisition strategies tailored to IDW- and RBF-based global optimisation. Specifically, we develop dynamic programming-based paradigms, including rollout and multi-step scenario-based optimisation schemes, to enable lookahead acquisition. These methods optimise a sequence of query points over a horizon (instead of only at the next step) by predicting the evolution of the surrogate model, inherently managing the exploration-exploitation trade-off in a systematic way via optimisation techniques. The proposed approach represents a significant advance in extending nonmyopic acquisition principles, previously confined to Bayesian optimisation, to the deterministic framework. Empirical results on synthetic and hyperparameter tuning benchmark problems demonstrate that these nonmyopic methods outperform conventional myopic approaches.