Zhiyu He

h-index: 6

4 papers

40 citations

Novelty: 43%

AI Score: 40

Ranked #70,735 of 194,257 authors (top 36%); #234 in OC (top 29%)

4 Papers

7.4 · OC · May 24

Gray-Box Nonlinear Feedback Optimization

Zhiyu He, Saverio Bolognani, Michael Muehlebach et al.

Feedback optimization enables autonomous optimality seeking of a dynamical system through its closed-loop interconnection with iterative optimization algorithms. Among various iteration structures, model-based approaches require the input-output sensitivity matrix of the system to construct gradients, whereas model-free approaches eliminate this need by estimating gradients from real-time objective evaluations. These approaches offer complementary benefits in sample efficiency and accuracy against model mismatch, i.e., sensitivity errors. To achieve balanced closed-loop performance, we propose a gray-box feedback optimization controller, featuring systematic incorporation of approximate sensitivities into model-free updates via a tunable convex combination. We provide unified performance characterizations covering different approaches. We elucidate how cumulative sensitivity errors (model-based) and variances due to stochastic exploration (model-free) shape the closed-loop behavior and induce a trade-off between iteration and dimensional dependence. The proposed controller retains sample efficiency and provable (local) optimality for nonconvex problems despite inaccurate sensitivities. We further develop and characterize a running gray-box controller that handles constrained time-varying problems with changing objectives and steady-state input-output maps.
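The convex combination described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the paper: the linear plant `H_TRUE`, the perturbed sensitivity `H_HAT`, the quadratic objective, the two-point gradient estimator, and the names `plant`, `cost`, `alpha`, and `gray_box_run`. The weight `alpha` blends a model-based gradient (built from the approximate sensitivity) with a model-free estimate (built from objective evaluations only).

```python
import numpy as np

# Hypothetical plant: the true steady-state map y = H_TRUE @ u is unknown to the controller.
H_TRUE = np.array([[1.0, 0.5, 0.0],
                   [0.0, 1.0, 0.5]])
# Approximate sensitivity available to the controller (assumed model mismatch).
H_HAT = H_TRUE + 0.1 * np.array([[1.0, -1.0, 1.0],
                                 [-1.0, 1.0, 1.0]])
Y_REF = np.array([1.0, -1.0])  # assumed output reference
REG = 0.1                      # assumed input regularization weight


def plant(u):
    """Measured steady-state output of the (hypothetical) system."""
    return H_TRUE @ u


def cost(u, y):
    """Assumed objective: output tracking plus input regularization."""
    return 0.5 * np.sum((y - Y_REF) ** 2) + 0.5 * REG * np.sum(u ** 2)


def model_based_grad(u, y):
    """Chain-rule gradient using the approximate sensitivity H_HAT."""
    return REG * u + H_HAT.T @ (y - Y_REF)


def model_free_grad(u, rng, delta=0.05):
    """Two-point estimate from objective evaluations at perturbed inputs."""
    v = rng.standard_normal(u.size)
    v /= np.linalg.norm(v)  # random direction on the unit sphere
    up, down = u + delta * v, u - delta * v
    slope = (cost(up, plant(up)) - cost(down, plant(down))) / (2 * delta)
    return u.size * slope * v


def gray_box_run(alpha=0.5, step=0.05, iters=300, seed=0):
    """Iterate with a convex combination of the two gradient sources."""
    rng = np.random.default_rng(seed)
    u = np.zeros(3)
    for _ in range(iters):
        y = plant(u)
        g = alpha * model_based_grad(u, y) + (1 - alpha) * model_free_grad(u, rng)
        u = u - step * g
    return u, cost(u, plant(u))
```

In this sketch, `alpha=1.0` gives a purely model-based update and `alpha=0.0` a purely model-free one; intermediate values trade the bias from sensitivity errors against the variance from random exploration.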

7.6 · SY · Apr 8

Hierarchical Strategic Decision-Making in Layered Mobility Systems

Mingjia He, Zhiyu He, Jan Ghadamian et al.

Mobility systems are complex socio-technical environments influenced by multiple stakeholders with hierarchically interdependent decisions, rendering effective control and policy design inherently challenging. We bridge hierarchical game-theoretic modeling with online feedback optimization by casting urban mobility as a tri-level Stackelberg game (travelers, operators, municipality) closed in a feedback loop. The municipality iteratively updates taxes, subsidies, and operational constraints using a projected two-point (gradient-free) scheme, while lower levels respond through equilibrium computations (Frank-Wolfe for traveler equilibrium; operator best responses). This model-free pipeline enforces constraints, accommodates heterogeneous users and modes, and scales to higher-dimensional policy vectors without differentiating through equilibrium maps. On a real multimodal network for Zurich, Switzerland, our method attains substantially better municipal objectives than Bayesian optimization and genetic algorithms, and identifies integration incentives that increase multimodal usage while improving both operator objectives. The results show that feedback-based regulation can steer competition toward cooperative outcomes and deliver tangible welfare gains in complex, data-rich mobility ecosystems.
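A projected two-point (gradient-free) update of the kind mentioned in this abstract might look roughly as follows. This is a sketch under stated assumptions, not the authors' pipeline: `municipal_cost` is a hypothetical quadratic stand-in for the black box that would otherwise run the lower-level equilibrium computations, and `TARGET`, the box bounds, and the step sizes are invented for illustration. Projecting onto a box shrunk by `delta` is one common way to keep both query points feasible; whether the paper does this is not stated.

```python
import numpy as np

# Hypothetical target policy; partly outside the feasible box on purpose.
TARGET = np.array([0.8, -0.5, 1.5])


def municipal_cost(theta):
    """Hypothetical black box standing in for the equilibrium-based objective."""
    return float(np.sum((theta - TARGET) ** 2))


def projected_two_point(cost, theta0, lo, hi, step=0.02, delta=0.05,
                        iters=500, seed=0):
    """Gradient-free descent with projection onto the box [lo, hi]."""
    rng = np.random.default_rng(seed)
    # Shrink the box by delta so that theta +/- delta * v stays feasible.
    theta = np.clip(theta0, lo + delta, hi - delta)
    for _ in range(iters):
        v = rng.standard_normal(theta.size)
        v /= np.linalg.norm(v)  # random unit direction
        # Two objective evaluations give a directional slope estimate.
        slope = (cost(theta + delta * v) - cost(theta - delta * v)) / (2 * delta)
        grad_est = theta.size * slope * v
        # Gradient step followed by projection (clipping) onto the shrunk box.
        theta = np.clip(theta - step * grad_est, lo + delta, hi - delta)
    return theta
```

The update only needs objective values, so nothing has to be differentiated through the black box; the price is a noisy gradient estimate whose variance grows with the policy dimension.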

14.4 · LG · Jan 27, 2025

The Sample Complexity of Online Reinforcement Learning: A Multi-model Perspective

Michael Muehlebach, Zhiyu He, Michael I. Jordan

We study the sample complexity of online reinforcement learning in the general setting of nonlinear dynamical systems with continuous state and action spaces. Our analysis accommodates a large class of dynamical systems ranging from a finite set of nonlinear candidate models to models with bounded and Lipschitz continuous dynamics, to systems that are parametrized by a compact and real-valued set of parameters. In the most general setting, our algorithm achieves a policy regret of $\mathcal{O}(N \epsilon^2 + \ln(m(\epsilon))/\epsilon^2)$, where $N$ is the time horizon, $\epsilon$ is a user-specified discretization width, and $m(\epsilon)$ measures the complexity of the function class under consideration via its packing number. In the special case where the dynamics are parametrized by a compact and real-valued set of parameters (such as neural networks, transformers, etc.), we prove a policy regret of $\mathcal{O}(\sqrt{N p})$, where $p$ denotes the number of parameters, recovering earlier sample-complexity results that were derived for linear time-invariant dynamical systems. While this article focuses on characterizing sample complexity, the proposed algorithms are likely to be useful in practice, due to their simplicity, their ability to incorporate prior knowledge, and their benign transient behavior.
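A short heuristic calculation shows how the two terms in the general regret bound trade off. It is only a sketch: the logarithmic packing term is treated as a constant $C$ (an assumption made here for simplicity, not the paper's argument), and $R$ and $\epsilon_\star$ are names introduced for illustration.

```latex
\[
  R(\epsilon) = N\epsilon^{2} + \frac{C}{\epsilon^{2}},
  \qquad
  \frac{dR}{d\epsilon} = 2N\epsilon - \frac{2C}{\epsilon^{3}} = 0
  \;\Longrightarrow\;
  \epsilon_\star^{4} = \frac{C}{N},
  \qquad
  R(\epsilon_\star) = 2\sqrt{NC}.
\]
```

If $C$ grows roughly linearly in the number of parameters $p$ (again an assumption), the balanced value has the same form as the $\mathcal{O}(\sqrt{N p})$ rate stated in the abstract, up to the logarithmic dependence ignored here.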

16.2 · OC · Jan 25, 2024

Towards a Systems Theory of Algorithms

Florian Dörfler, Zhiyu He, Giuseppe Belgioioso et al.

Traditionally, numerical algorithms are seen as isolated pieces of code confined to an in silico existence. However, this perspective is not appropriate for many modern computational approaches in control, learning, or optimization, wherein in vivo algorithms interact with their environment. Examples of such open algorithms include various real-time optimization-based control strategies, reinforcement learning, decision-making architectures, online optimization, and many more. Further, even closed algorithms in learning or optimization are increasingly abstracted in block diagrams with interacting dynamic modules and pipelines. In this opinion paper, we state our vision on a to-be-cultivated systems theory of algorithms and argue in favor of viewing algorithms as open dynamical systems interacting with other algorithms, physical systems, humans, or databases. Remarkably, the manifold tools developed under the umbrella of systems theory are well suited for addressing a range of challenges in the algorithmic domain. We survey various instances where the principles of algorithmic systems theory are being developed and outline pertinent modeling, analysis, and design challenges.