SY · May 6

Second-Order MPC-Based Distributed Q-Learning

arXiv:2511.16424 · 15.1 · h-index: 12
Predicted impact top 51% in SY · last 90 days
Originality Incremental advance
AI Analysis

For multi-agent control systems using MPC-based Q-learning, this provides a faster and more stable learning method.

This work extends MPC-based distributed Q-learning to second-order gradient updates, enabling faster convergence and higher learning rates without instability. Simulations show it significantly outperforms first-order distributed Q-learning.

The state of the art for model predictive control (MPC)-based distributed Q-learning is limited to first-order gradient updates of the MPC parameterization. In general, using second-order information can significantly improve the speed of convergence for learning, allowing the use of higher learning rates without introducing instability. This work presents a second-order extension to MPC-based Q-learning with updates distributed across local agents, relying only on locally available information and neighbor-to-neighbor communication. In simulation, the approach is demonstrated to significantly outperform first-order distributed Q-learning.
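
The general claim that second-order information tolerates higher learning rates can be illustrated with a toy sketch. This is not the paper's algorithm: it has no MPC, no agents, and no communication. A fixed, ill-conditioned quadratic loss stands in for the learning objective, and the names `H`, `b`, `alpha`, `run`, and `error` are assumptions chosen for illustration. With the same learning rate, the plain gradient step diverges along the steep direction, while the Hessian-scaled (Newton-type) step contracts in every direction.

```python
# Toy illustration only: a fixed quadratic loss stands in for the learning
# objective. There is no MPC, no agents, and no communication here.

def grad(theta, H, b):
    """Gradient of L(theta) = 0.5 * theta^T H theta - b^T theta."""
    return [sum(H[i][j] * theta[j] for j in range(2)) - b[i] for i in range(2)]

def solve_2x2(H, g):
    """Solve H d = g for a 2x2 matrix H using the explicit inverse."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return [
        (H[1][1] * g[0] - H[0][1] * g[1]) / det,
        (-H[1][0] * g[0] + H[0][0] * g[1]) / det,
    ]

def run(theta, H, b, alpha, steps, second_order):
    """Iterate theta <- theta - alpha * d, where d is the gradient
    (first order) or the Hessian-scaled gradient (second order)."""
    for _ in range(steps):
        g = grad(theta, H, b)
        d = solve_2x2(H, g) if second_order else g
        theta = [theta[i] - alpha * d[i] for i in range(2)]
    return theta

def error(theta):
    """Largest componentwise distance to the minimizer [1, 1]."""
    return max(abs(theta[0] - 1.0), abs(theta[1] - 1.0))

# Curvatures 1 and 100: a plain gradient step is stable only for alpha < 2/100.
H = [[1.0, 0.0], [0.0, 100.0]]
b = [1.0, 100.0]          # minimizer is theta* = [1, 1]
alpha = 0.5               # far above the first-order stability limit

first = run([0.0, 0.0], H, b, alpha, steps=20, second_order=False)
second = run([0.0, 0.0], H, b, alpha, steps=20, second_order=True)

print("first-order error: ", error(first))
print("second-order error:", error(second))
```

In this sketch the first-order error grows by a factor of 49 per step in the steep direction, whereas the second-order error halves every step. In a learning setting the curvature would have to be estimated from data rather than known exactly, so the gap seen here should be read as an idealized best case.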

