Florian Dörfler

10 papers

4 citations

Novelty 54%

AI Score 50

Ranked #43,385 of 201,018 authors (top 22%); #155 in SY (top 16%)

10 Papers

SY Mar 30

Optimistic Online LQR via Intrinsic Rewards

Marcell Bartos, Bruce D. Lee, Lenart Treven et al.

Optimism in the face of uncertainty is a popular approach to balance exploration and exploitation in reinforcement learning. Here, we consider the online linear quadratic regulator (LQR) problem, i.e., to learn the LQR corresponding to an unknown linear dynamical system by adapting the control policy online based on closed-loop data collected during operation. In this work, we propose Intrinsic Rewards LQR (IR-LQR), an optimistic online LQR algorithm that applies the idea of intrinsic rewards originating from reinforcement learning and the concept of variance regularization to promote uncertainty-driven exploration. IR-LQR retains the structure of a standard LQR synthesis problem by only modifying the cost function, resulting in an intuitively pleasing, simple, computationally cheap, and efficient algorithm. This is in contrast to existing optimistic online LQR formulations that rely on more complicated iterative search algorithms or solve computationally demanding optimization problems. We show that IR-LQR achieves the optimal worst-case regret rate of $\sqrt{T}$, and compare it to various state-of-the-art online LQR algorithms via numerical experiments carried out on an aircraft pitch angle control and an unmanned aerial vehicle example.
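For orientation, the "standard LQR synthesis problem" that the abstract says IR-LQR retains can be sketched as below. This is only the baseline discrete-time LQR, not IR-LQR itself: the abstract does not give the form of the intrinsic-reward cost modification, so none is attempted here. The system matrices, the weights, and the function name `dlqr` are assumptions chosen for illustration.

```python
import numpy as np

def dlqr(A, B, Q, R, iters=500):
    """Discrete-time LQR gain via fixed-point iteration on the Riccati equation."""
    P = Q.copy()
    for _ in range(iters):
        # Gain for the current cost-to-go matrix P
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati update: P = Q + A'PA - A'PB (R + B'PB)^{-1} B'PA
        P = Q + A.T @ P @ (A - B @ K)
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P

# Hypothetical double-integrator system (not taken from the paper)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)            # state cost; a cost-modifying method would change only this part
R = np.array([[1.0]])    # input cost

K, P = dlqr(A, B, Q, R)
rho = max(abs(np.linalg.eigvals(A - B @ K)))  # closed-loop spectral radius
```

Because the synthesis step only sees `Q` and `R`, any method that changes the cost alone can reuse the same solver unchanged, which is the structural point the abstract makes.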

SY Apr 8

Hierarchical Strategic Decision-Making in Layered Mobility Systems

Mingjia He, Zhiyu He, Jan Ghadamian et al.

Mobility systems are complex socio-technical environments influenced by multiple stakeholders with hierarchically interdependent decisions, rendering effective control and policy design inherently challenging. We bridge hierarchical game-theoretic modeling with online feedback optimization by casting urban mobility as a tri-level Stackelberg game (travelers, operators, municipality) closed in a feedback loop. The municipality iteratively updates taxes, subsidies, and operational constraints using a projected two-point (gradient-free) scheme, while lower levels respond through equilibrium computations (Frank-Wolfe for traveler equilibrium; operator best responses). This model-free pipeline enforces constraints, accommodates heterogeneous users and modes, and scales to higher-dimensional policy vectors without differentiating through equilibrium maps. On a real multimodal network for Zurich, Switzerland, our method attains substantially better municipal objectives than Bayesian optimization and genetic algorithms, and identifies integration incentives that increase multimodal usage while improving both operator objectives. The results show that feedback-based regulation can steer competition toward cooperative outcomes and deliver tangible welfare gains in complex, data-rich mobility ecosystems.
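A projected two-point (gradient-free) update of the kind mentioned above can be sketched as follows. This is a common textbook form of the estimator, not necessarily the exact scheme used in the paper. The quadratic `objective` is a hypothetical stand-in for the municipal objective (which in the paper is evaluated through the lower-level equilibria), and the box bounds, step size, and smoothing radius are likewise assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def objective(x):
    # Hypothetical stand-in for the upper-level objective; only function
    # values are queried, never gradients.
    return float(np.sum((x - np.array([0.5, 2.0, -1.0])) ** 2))

def two_point_step(f, x, lo, hi, delta=1e-2, step=0.02):
    """One projected two-point gradient-free update on the box [lo, hi]."""
    u = rng.standard_normal(x.size)
    u /= np.linalg.norm(u)  # random direction on the unit sphere
    # Two function evaluations give a directional finite difference
    g = x.size * (f(x + delta * u) - f(x - delta * u)) / (2 * delta) * u
    # Projection onto the box enforces the policy constraints
    return np.clip(x - step * g, lo, hi)

lo, hi = -np.ones(3), np.ones(3)
x = np.zeros(3)
f0 = objective(x)
for _ in range(2000):
    x = two_point_step(objective, x, lo, hi)
f_final = objective(x)
```

Each iteration needs only two evaluations of the objective, which is why such schemes avoid differentiating through equilibrium maps; the price is a noisy gradient estimate, so the iterate hovers near the constrained optimum rather than converging exactly.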

SY Sep 8, 2025

Gaussian behaviors: representations and data-driven control

András Sasfi, Ivan Markovsky, Alberto Padoan et al.

We propose a modeling framework for stochastic systems, termed Gaussian behaviors, that describes finite-length trajectories of a system as a Gaussian process. The proposed model naturally quantifies the uncertainty in the trajectories, yet it is simple enough to allow for tractable formulations. We relate the proposed model to existing descriptions of dynamical systems including deterministic and stochastic behaviors, and linear time-invariant (LTI) state-space models with Gaussian noise. Gaussian behaviors can be estimated directly from observed data as the empirical sample covariance. The distribution of future outputs conditioned on inputs and past outputs provides a predictive model that can be incorporated in predictive control frameworks. We show that subspace predictive control is a certainty-equivalence control formulation with the estimated Gaussian behavior. Furthermore, the regularized data-enabled predictive control (DeePC) method is shown to be a distributionally optimistic formulation that optimistically accounts for uncertainty in the Gaussian behavior. To mitigate the excessive optimism of DeePC, we propose a novel distributionally robust control formulation, and provide a convex reformulation allowing for efficient implementation.
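The two steps described above, estimating a Gaussian model of finite-length trajectories from the sample covariance and then conditioning future outputs on inputs and past outputs, can be illustrated with standard Gaussian conditioning. The sketch below assumes zero-mean trajectories, a hypothetical scalar system, and an arbitrary stacking order (inputs first, then outputs); none of these details come from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(T):
    # Hypothetical scalar system: y[t+1] = 0.8*y[t] + u[t] + noise
    u = rng.standard_normal(T)
    y = np.zeros(T)
    y[0] = rng.standard_normal()
    for t in range(T - 1):
        y[t + 1] = 0.8 * y[t] + u[t] + 0.1 * rng.standard_normal()
    return u, y

N, T = 5000, 4
# Each row is one trajectory w = (u_0..u_3, y_0..y_3)
W = np.array([np.concatenate(simulate(T)) for _ in range(N)])
Sigma = W.T @ W / N  # empirical covariance (zero mean assumed)

given = [0, 1, 2, 3, 4, 5]   # all inputs plus past outputs y_0, y_1
future = [6, 7]              # future outputs y_2, y_3
S_gg = Sigma[np.ix_(given, given)]
S_fg = Sigma[np.ix_(future, given)]
S_ff = Sigma[np.ix_(future, future)]

G = np.linalg.solve(S_gg, S_fg.T).T       # S_fg @ inv(S_gg)
w_given = np.array([0.5, -1.0, 0.3, 0.0, 1.0, 1.3])
y_mean = G @ w_given                      # conditional mean of (y_2, y_3)
y_cov = S_ff - G @ S_fg.T                 # conditional covariance
```

The conditional mean plays the role of a certainty-equivalence predictor, while `y_cov` quantifies the remaining uncertainty that a robust or optimistic formulation could use.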

SY Apr 7

Model-Free Power System Stability Enhancement with Dissipativity-Based Neural Control

Yifei Wang, Han Wang, Kehao Zhuang et al.

The integration of converter-interfaced generation introduces new transient stability challenges to modern power systems. Classical Lyapunov- and scalable passivity-based approaches typically rely on restrictive assumptions, and finding storage functions for large grids is generally considered intractable. Furthermore, most methods require an accurate grid dynamics model. To address these challenges, we propose a model-free, nonlinear, and dissipativity-based controller which, when applied to grid-connected virtual synchronous generators (VSGs), enhances power system transient stability. Using input-state data, we train neural networks to learn dissipativity-characterizing matrices that yield stabilizing controllers. Furthermore, we incorporate cost function shaping to improve the performance with respect to the user-specified objectives. Numerical results on a modified, all-VSG Kundur two-area power system validate the effectiveness of the proposed approach.

OC Apr 7

Scaled Graph Containment for Feedback Stability: Soft-Hard Equivalence and Conic Regions

Eder Baron-Prada, Julius P. J. Krebbekx, Adolfo Anta et al.

Scaled graphs (SGs) offer a geometric framework for feedback stability analysis. This paper develops containment conditions for SGs within multiplier-defined regions, addressing both circular and conic geometries. For circular regions, we show that soft and hard SG containment are equivalent whenever the associated multiplier is positive-negative. This enables hard stability certification from soft computations alone, bypassing both the positive semidefinite storage constraint and the homotopy condition of existing methods. Numerical experiments on systems with up to 300 states demonstrate computational savings of 15-44% for the circular containment framework. We further characterize which conic regions are hyperbolically convex, a condition our frequency-domain certificate requires, and demonstrate that such regions provide tighter SG bounds than circles whenever the operator SG is nonsymmetric.

SY Mar 14

On the Impact of Operating Points on Small-Signal Stability: Decentralized Stability Sets via Scaled Relative Graphs

Eder Baron-Prada, Adolfo Anta, Florian Dörfler

This paper presents a decentralized frequency-domain framework to characterize the influence of the operating point on the small-signal stability of converter-dominated power systems. The approach builds on Scaled Relative Graph (SRG) analysis, extended here to address Linear Parameter-Varying (LPV) systems. By exploiting the affine dependence of converter admittances on their steady-state operating points, the centralized small-signal stability assessment of the grid is decomposed into decentralized, frequency-wise geometric tests. Each converter can independently evaluate its feasible stability region, expressed as a set of linear inequalities in its parameter space. The framework provides closed-form geometric characterizations applicable to both grid-following (GFL) and grid-forming (GFM) converters, and validation results confirm its effectiveness.

OC Apr 10

A Bayesian Perspective on the Data-Driven LQR

Thierry Schwaller, Feiran Zhao, Florian Dörfler

The data-driven linear quadratic regulator (ddLQR) is a widely studied control method for unknown dynamical systems subject to disturbances. Existing approaches, both indirect (those that identify a model and then apply model-based design) and direct (those that bypass the identification step), often rely on the certainty-equivalence principle and therefore do not explicitly account for model uncertainty. In this paper, we propose a Bayesian formulation for both indirect and direct ddLQR that incorporates posterior uncertainty into the control design. The resulting expected cost decomposes into a certainty-equivalence term and a variance-dependent term, providing a principled interpretation of regularization. We further show that the indirect and direct formulations are equivalent under this perspective. The resulting direct method admits a tractable semidefinite program whose size is independent of the data length. Numerical simulations demonstrate improved optimality gap and closed-loop stability, particularly in low-data regimes.

SY Mar 17

Data-driven generalized perimeter control: Zürich case study

Alessio Rimoldi, Carlo Cenedese, Alberto Padoan et al.

Urban traffic congestion is a key challenge for the development of modern cities, requiring advanced control techniques to optimize the use of existing infrastructure. Despite the extensive availability of data, modeling such complex systems remains an expensive and time-consuming step when designing model-based control approaches. On the other hand, machine learning approaches require simulations to bootstrap models, or are unable to deal with the sparse nature of traffic data and enforce hard constraints. We propose a novel formulation of traffic dynamics based on behavioral systems theory and apply data-enabled predictive control to steer traffic dynamics via dynamic traffic light control. A high-fidelity simulation of the city of Zürich, the largest closed-loop microscopic simulation of urban traffic in the literature to the best of our knowledge, is used to validate the performance of the proposed method in terms of total travel time and CO2 emissions.

SY Apr 1

Soft projections for robust data-driven control

András Sasfi, Jaap Eising, Florian Dörfler

We consider data-based predictive control based on behavioral systems theory. In the linear setting this means that a system is described as a subspace of trajectories, and predictive control can be formulated using a projection onto the intersection of this behavior and a constraint set. Instead of learning the model, or subspace, we focus on determining this projection from data. Motivated by the use of regularization in data-enabled predictive control (DeePC), we introduce the use of soft projections, which approximate the true projector onto the behavior from noisy data. In the simplest case, these are equivalent to known regularized DeePC schemes, but they exhibit a number of benefits. First, we provide a bound on the approximation error consisting of a bias and a variance term that can be traded off by the regularization weight. The derived bound is independent of the true system order, highlighting the benefit of soft projections compared to low-dimensional subspace estimates. Moreover, soft projections allow for intuitive generalizations, one of which we show has superior performance on a case study. Finally, we provide update formulas for soft projectors enabling the efficient adaptation of the proposed data-driven control methods in the case of streaming data.
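To make the idea of a soft projection concrete, the sketch below uses a ridge-regularized projector, P = S (S + λI)^{-1} with S = H Hᵀ, built from a noisy data matrix H. This specific form is an assumption chosen because it is the natural counterpart of quadratic regularization; the paper's exact definition may differ. The dimensions, noise level, and regularization weight `lam` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

def soft_projector(H, lam):
    """Ridge-type soft projector S (S + lam*I)^{-1}, with S = H H^T (assumed form)."""
    S = H @ H.T
    # S and (S + lam*I)^{-1} commute, so solving from the left gives the same matrix
    return np.linalg.solve(S + lam * np.eye(S.shape[0]), S)

# Hypothetical 2-dimensional subspace of R^6 standing in for the behavior
U, _ = np.linalg.qr(rng.standard_normal((6, 2)))
P_true = U @ U.T                                      # exact orthogonal projector
H_clean = U @ rng.standard_normal((2, 50))            # noise-free trajectories
H = H_clean + 0.01 * rng.standard_normal((6, 50))     # noisy data matrix

P_soft = soft_projector(H, lam=1.0)
err_soft = np.linalg.norm(P_soft - P_true, 2)

# The noisy H has full row rank, so the exact projector onto its
# column space is the identity and carries no information.
P_hard = H @ np.linalg.pinv(H)
err_hard = np.linalg.norm(P_hard - P_true, 2)
```

In this toy setting the regularization weight damps the directions excited only by noise while leaving the dominant directions almost untouched, which illustrates the bias-variance trade-off mentioned in the abstract without requiring knowledge of the true subspace dimension.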

SY Mar 13

Next-Generation Grid Codes: Towards a New Paradigm for Dynamic Ancillary Services

Verena Häberle, Kehao Zhuang, Xiuqiang He et al.

This paper introduces a conceptual foundation for Next Generation Grid Codes (NGGCs) based on stability and performance certificates, enabling the provision of dynamic ancillary services such as fast frequency and voltage regulation through decentralized frequency-domain criteria. The NGGC framework offers two key benefits: (i) rigorous closed-loop stability guarantees, and (ii) explicit performance guarantees for frequency and voltage dynamics in power systems. Regarding (i) stability, we employ loop-shifting and passivity-based techniques to derive local frequency-domain stability certificates for individual device dynamics. These certificates ensure the closed-loop stability of the entire interconnected power system through fully decentralized verification. Concerning (ii) performance, we establish quantitative bounds on critical time-domain indicators of system dynamics, including the average-mode frequency and voltage nadirs, the rate-of-change-of-frequency (RoCoF), steady-state deviations, and oscillation damping capabilities. The bounds are obtained by expressing the performance metrics as frequency-domain conditions on local device behavior. The NGGC framework is non-parametric, model-agnostic, and accommodates arbitrary device dynamics under mild assumptions. It thus provides a unified, decentralized approach to certifying both stability and performance without requiring explicit device-model parameterizations. Moreover, the NGGC framework can be directly used as a set of specifications for control design, offering a principled foundation for future stability- and performance-oriented grid codes in power systems.