OCMay 24
Gray-Box Nonlinear Feedback OptimizationZhiyu He, Saverio Bolognani, Michael Muehlebach et al.
Feedback optimization enables autonomous optimality seeking of a dynamical system through its closed-loop interconnection with iterative optimization algorithms. Among various iteration structures, model-based approaches require the input-output sensitivity matrix of the system to construct gradients, whereas model-free approaches eliminate this need by estimating gradients from real-time objective evaluations. These approaches offer complementary benefits in sample efficiency and accuracy against model mismatch, i.e., sensitivity errors. To achieve balanced closed-loop performance, we propose a gray-box feedback optimization controller, featuring systematic incorporation of approximate sensitivities into model-free updates via a tunable convex combination. We provide unified performance characterizations covering different approaches. We elucidate how cumulative sensitivity errors (model-based) and variances due to stochastic exploration (model-free) shape the closed-loop behavior and induce a trade-off between iteration and dimensional dependence. The proposed controller retains sample efficiency and provable (local) optimality for nonconvex problems despite inaccurate sensitivities. We further develop and characterize a running gray-box controller that handles constrained time-varying problems with changing objectives and steady-state input-output maps.
OCOct 2, 2012
A distributed control strategy for reactive power compensation in smart microgridsSaverio Bolognani, Sandro Zampieri
We consider the problem of optimal reactive power compensation for the minimization of power distribution losses in a smart microgrid. We first propose an approximate model for the power distribution network, which allows us to cast the problem into the class of convex quadratic, linearly constrained, optimization problems. We then consider the specific problem of commanding the microgenerators connected to the microgrid, in order to achieve the optimal injection of reactive power. For this task, we design a randomized, gossip-like optimization algorithm. We show how a distributed approach is possible, where microgenerators need to have only a partial knowledge of the problem parameters and of the state, and can perform only local measurements. For the proposed algorithm, we provide conditions for convergence together with an analytic characterization of the convergence speed. The analysis shows that, in radial networks, the best performance can be achieved when we command cooperation among units that are neighbors in the electric topology. Numerical simulations are included to validate the proposed model and to confirm the analytic results about the performance of the proposed algorithm.
ROOct 24, 2022
How Bad is Selfish Driving? Bounding the Inefficiency of Equilibria in Urban Driving GamesAlessandro Zanardi, Pier Giuseppe Sessa, Nando Käslin et al.
We consider the interaction among agents engaging in a driving task and we model it as general-sum game. This class of games exhibits a plurality of different equilibria posing the issue of equilibrium selection. While selecting the most efficient equilibrium (in term of social cost) is often impractical from a computational standpoint, in this work we study the (in)efficiency of any equilibrium players might agree to play. More specifically, we bound the equilibrium inefficiency by modeling driving games as particular type of congestion games over spatio-temporal resources. We obtain novel guarantees that refine existing bounds on the Price of Anarchy (PoA) as a function of problem-dependent game parameters. For instance, the relative trade-off between proximity costs and personal objectives such as comfort and progress. Although the obtained guarantees concern open-loop trajectories, we observe efficient equilibria even when agents employ closed-loop policies trained via decentralized multi-agent reinforcement learning.
SYMay 26
Incentive-Based Load Curtailment with Limited Information: A Bilevel Zeroth-Order Learning ApproachZhisen Jiang, Florian Dörfler, Saverio Bolognani
Incentive-based load curtailment unlocks critical demand-side flexibility but is hindered by the limited knowledge of private user parameters and the inherent nonsmoothness of responses due to physical device constraints. We address this via a constrained bilevel optimization framework and propose the Bi-ZOL (Bilevel Zeroth-Order Learning) algorithm. Unlike conventional black-box methods, Bi-ZOL exploits the bilevel structure to decompose the hypergradient, integrating the exact analytical information of the SO's objective with a zeroth-order estimate of the unknown response sensitivity. This structural decomposition-based learning method mathematically smoothes the nonsmooth response landscape and reduces hypergradient estimation error. We provide theoretical convergence guarantees to an approximate stationary point and demonstrate through simulations that Bi-ZOL achieves near-optimal performance.
SYMar 23
Towards Fair and Efficient allocation of Mobility-on-Demand resources through a Karma EconomyMatteo Cederle, Saverio Bolognani, Gian Antonio Susto
Mobility-on-demand systems like ride-hailing have transformed urban transportation, but they have also exacerbated socio-economic inequalities in access to these services, also due to surge pricing strategies. Although several fairness-aware frameworks have been proposed in smart mobility, they often overlook the temporal and situational variability of user urgency that shapes real-world transportation demands. This paper introduces a non-monetary, Karma-based mechanism that models endogenous urgency, allowing user time-sensitivity to evolve in response to system conditions as well as external factors. We develop a theoretical framework maintaining the efficiency and fairness guarantees of classical Karma economies, while accommodating this realistic user behavior modeling. Applied to a simplified simulated mobility-on-demand scenario, we provide a proof-of-concept illustration of the proposed framework, showing that it exhibits promising behavior in terms of system efficiency and equitable resource allocation, while acknowledging that a full treatment of realistic MoD complexity remains an important direction for future work.
GTMay 11
Towards Model-Free Learning in Dynamic Population Games: An Application to Karma EconomiesMatteo Cederle, Saverio Bolognani, Gian Antonio Susto
Dynamic Population Games (DPGs) provide a tractable framework for modeling strategic interactions in large populations of self-interested agents, and have been successfully applied to the design of Karma economies, a class of fair non-monetary resource allocation mechanisms. Despite their appealing theoretical properties, existing computational tools for DPGs assume full knowledge of the game model and operate in a centralized fashion, limiting their applicability in realistic settings where agents have access only to their own private experience. This paper takes a step towards addressing this gap by studying model-free equilibrium learning in Karma DPGs. First, we analyze the setting in which a novel agent joins a Karma DPG already at its Stationary Nash Equilibrium (SNE) and learns a policy via Deep Q-Networks (DQN) without knowledge of the game model. Leveraging recent convergence results for DQN, we establish a suboptimality bound consisting of a DQN approximation error of order $O(1/\sqrt{N_s})$ and a mean field perturbation error of order $O(1/N)$, where $N_s$ is the replay buffer size and $N$ is the population size. Second, we consider the challenging problem of learning the SNE from scratch. We show empirically that combining deep RL with fictitious play and smoothed policy iteration allows agents to converge, in a model-free fashion, to a configuration close to the centrally computed SNE. Together, these contributions support the vision of Karma economies as practical tools for fair resource allocation.
SYMay 5
A Welfarist Perspective on Fair Generation CurtailmentJonas G. Matt, Ilia Shilov, Saverio Bolognani
This paper presents a welfarist approach to fair active power curtailment in distribution grids with distributed photovoltaics. We address the lack of consistent axiomatic foundations in existing ad-hoc curtailment rules by modeling the decision as a social choice problem over feasible operating points and by deriving curtailment objectives from a set of foundational axioms that express principled stances on fairness and grid access rights. Rather than relying on the typically assumed full comparability of utilities, which can lead to undesirable outcomes in heterogeneous residential systems, we adopt a cardinal non-comparability stance on utilities. This approach requires far fewer assumptions about prosumers' private preferences while providing a rigorous basis for fair social ranking. We then present a unified framework that demonstrates that existing curtailment schemes represent specific instances of the Kalai-Smorodinsky rule applied to different normative reference points. This perspective offers grid operators an auditable, axiomatic foundation for justifying fairness in local energy systems.
SYApr 30
Optimal Functional Incentives for Control: The Linear-Quadratic Case with Bilinear IncentivesJonas G. Matt, Saverio Bolognani, Florian Dörfler
We study the design of functional incentive mechanisms for dynamical systems, in which a leader designs a fixed incentive function to motivate a self-interested follower to actuate the system beneficially over an extended horizon, without real-time revision of the incentive. This stands in contrast to the adaptive paradigm, in which the incentive is itself a continuously updated control variable. We formalize the problem as a discrete-time bi-level optimal control problem and derive analytical results for the linear-quadratic case with bilinear incentives and a myopic follower. Specifically, we establish a necessary and sufficient stability condition for the induced closed-loop system, derive a closed-form expression for the gradient of the expected leader cost with respect to the incentive parameter matrix, and obtain a fully closed-form cost expression in the scalar setting. Based on the latter, explicit characterizations of the optimal incentive parameter are provided in two asymptotic regimes: the infinite-horizon limit and the limit of high follower cost. For long horizons, the optimal incentive is shown to become independent of the follower's private cost parameter, with direct implications for robust mechanism design under private information.
GTApr 26
Strategically Robust Aggregative GamesAndreas Feik, Nicolas Lanzetti, Saverio Bolognani et al.
In many multiagent settings, such as electric vehicle charging and traffic routing, agents must make decisions in the face of uncertain behavior exhibited by others. Often, this uncertainty arises from multiple sources, such as incomplete information, limited computation, or bounded rationality, ultimately impacting the aggregate behavior. To tackle this challenge, we follow recent work on strategically robust game theory and postulate that agents seek protection directly against deviations around the emergent behavior, as opposed to explicitly modeling all sources of uncertainty. Specifically, we propose that each agent protects itself against the worst-case aggregate behavior within an optimal-transport-based ambiguity set centered at the emergent aggregate population behavior. This leads to a novel equilibrium concept, called strategically robust Wardrop equilibrium, that enables one to interpolate between standard Wardrop equilibria (no robustness) and security strategies (maximum robustness). In the setting of convex aggregative games, we establish the existence of a pure strategically robust Wardrop equilibrium and provide tractable computational tools for computing it. Through an application in electric vehicle charging, we demonstrate that strategically robust Wardrop equilibria lead to better decisions, protecting agents against the uncertain aggregate behavior of the population. Remarkably, we also observe that strategic robustness can lead to lower equilibrium costs for all agents, uncovering a "coordination-via-robustification" effect.
OCMar 10, 2025
Decision-Dependent Stochastic Optimization: The Role of Distribution DynamicsZhiyu He, Saverio Bolognani, Florian Dörfler et al.
Distribution shifts have long been regarded as troublesome external forces that a decision-maker should either counteract or conform to. An intriguing feedback phenomenon termed decision dependence arises when the deployed decision affects the environment and alters the data-generating distribution. In the realm of performative prediction, this is encoded by distribution maps parameterized by decisions due to strategic behaviors. In contrast, we formalize an endogenous distribution shift as a feedback process featuring nonlinear dynamics that couple the evolving distribution with the decision. Stochastic optimization in this dynamic regime provides a fertile ground to examine the various roles played by dynamics in the composite problem structure. To this end, we develop an online algorithm that achieves optimal decision-making by both adapting to and shaping the dynamic distribution. Throughout the paper, we adopt a distributional perspective and demonstrate how this view facilitates characterizations of distribution dynamics and the optimality and generalization performance of the proposed algorithm. We showcase the theoretical results in an opinion dynamics context, where an opportunistic party maximizes the affinity of a dynamic polarized population, and in a recommender system scenario, featuring performance optimization with discrete distributions in the probability simplex.
OCJan 25, 2024
Towards a Systems Theory of AlgorithmsFlorian Dörfler, Zhiyu He, Giuseppe Belgioioso et al.
Traditionally, numerical algorithms are seen as isolated pieces of code confined to an {\em in silico} existence. However, this perspective is not appropriate for many modern computational approaches in control, learning, or optimization, wherein {\em in vivo} algorithms interact with their environment. Examples of such {\em open algorithms} include various real-time optimization-based control strategies, reinforcement learning, decision-making architectures, online optimization, and many more. Further, even {\em closed} algorithms in learning or optimization are increasingly abstracted in block diagrams with interacting dynamic modules and pipelines. In this opinion paper, we state our vision on a to-be-cultivated {\em systems theory of algorithms} and argue in favor of viewing algorithms as open dynamical systems interacting with other algorithms, physical systems, humans, or databases. Remarkably, the manifold tools developed under the umbrella of systems theory are well suited for addressing a range of challenges in the algorithmic domain. We survey various instances where the principles of algorithmic systems theory are being developed and outline pertinent modeling, analysis, and design challenges.
CYMay 10, 2023
A Classification of Feedback Loops and Their Relation to Biases in Automated Decision-Making SystemsNicolò Pagan, Joachim Baumann, Ezzat Elokda et al.
Prediction-based decision-making systems are becoming increasingly prevalent in various domains. Previous studies have demonstrated that such systems are vulnerable to runaway feedback loops, e.g., when police are repeatedly sent back to the same neighborhoods regardless of the actual rate of criminal activity, which exacerbate existing biases. In practice, the automated decisions have dynamic feedback effects on the system itself that can perpetuate over time, making it difficult for short-sighted design choices to control the system's evolution. While researchers started proposing longer-term solutions to prevent adverse outcomes (such as bias towards certain groups), these interventions largely depend on ad hoc modeling assumptions and a rigorous theoretical understanding of the feedback dynamics in ML-based decision-making systems is currently missing. In this paper, we use the language of dynamical systems theory, a branch of applied mathematics that deals with the analysis of the interconnection of systems with dynamic behaviors, to rigorously classify the different types of feedback loops in the ML-based decision-making pipeline. By reviewing existing scholarly work, we show that this classification covers many examples discussed in the algorithmic fairness community, thereby providing a unifying and principled framework to study feedback loops. By qualitative analysis, and through a simulation example of recommender systems, we show which specific types of ML biases are affected by each type of feedback loop. We find that the existence of feedback loops in the ML-based decision-making pipeline can perpetuate, reinforce, or even reduce ML biases.
MANov 13, 2021
Posetal Games: Efficiency, Existence, and Refinement of Equilibria in Games with Prioritized MetricsAlessandro Zanardi, Gioele Zardini, Sirish Srinivasan et al.
Modern applications require robots to comply with multiple, often conflicting rules and to interact with the other agents. We present Posetal Games as a class of games in which each player expresses a preference over the outcomes via a partially ordered set of metrics. This allows one to combine hierarchical priorities of each player with the interactive nature of the environment. By contextualizing standard game theoretical notions, we provide two sufficient conditions on the preference of the players to prove existence of pure Nash Equilibria in finite action sets. Moreover, we define formal operations on the preference structures and link them to a refinement of the game solutions, showing how the set of equilibria can be systematically shrunk. The presented results are showcased in a driving game where autonomous vehicles select from a finite set of trajectories. The results demonstrate the interpretability of results in terms of minimum-rank-violation for each player.
MAJul 22, 2019
Today Me, Tomorrow Thee: Efficient Resource Allocation in Competitive Settings using Karma GamesAndrea Censi, Saverio Bolognani, Julian G. Zilly et al.
We present a new type of coordination mechanism among multiple agents for the allocation of a finite resource, such as the allocation of time slots for passing an intersection. We consider the setting where we associate one counter to each agent, which we call karma value, and where there is an established mechanism to decide resource allocation based on agents exchanging karma. The idea is that agents might be inclined to pass on using resources today, in exchange for karma, which will make it easier for them to claim the resource use in the future. To understand whether such a system might work robustly, we only design the protocol and not the agents' policies. We take a game-theoretic perspective and compute policies corresponding to Nash equilibria for the game. We find, surprisingly, that the Nash equilibria for a society of self-interested agents are very close in social welfare to a centralized cooperative solution. These results suggest that many resource allocation problems can have a simple, elegant, and robust solution, assuming the availability of a karma accounting mechanism.