Antonios Varvitsiotis

OC
Semantic Scholar Profile
h-index12
14papers
80citations
Novelty61%
AI Score56

14 Papers

58.0GTMay 24
Solving Imperfect-Recall Games via Sum-of-Squares Optimization

Rui Zheng, Ryann Sim, Antonios Varvitsiotis

Extensive-form games (EFGs) provide a powerful framework for modeling sequential decision making, capturing strategic interaction under imperfect information, chance events, and temporal structure. Most positive algorithmic and theoretical results for EFGs assume perfect recall, where players remember all past information and actions. We study the increasingly relevant setting of imperfect-recall EFGs (IREFGs), where players may forget parts of their history or previously acquired information, and where equilibrium computation is provably hard. We propose sum-of-squares (SOS) hierarchies for computing ex-ante optimal strategies in single-player IREFGs and Nash equilibria in multi-player IREFGs, working over behavioral strategies. Our theoretical results show that (i) these hierarchies converge asymptotically, (ii) under genericity assumptions, the convergence is finite, and (iii) in single-player non-absentminded IREFGs, convergence occurs at a finite level determined by the number of information sets. Finally, we introduce the new classes of (SOS)-concave and (SOS)-monotone IREFGs, and show that in the single-player setting the SOS hierarchy converges at the first level, enabling equilibrium computation with a single semidefinite program (SDP).

OCJul 6, 2023
Multiplicative Updates for Online Convex Optimization over Symmetric Cones

Ilayda Canyakmaz, Wayne Lin, Georgios Piliouras et al.

We study online convex optimization where the possible actions are trace-one elements in a symmetric cone, generalizing the extensively-studied experts setup and its quantum counterpart. Symmetric cones provide a unifying framework for some of the most important optimization models, including linear, second-order cone, and semidefinite optimization. Using tools from the field of Euclidean Jordan Algebras, we introduce the Symmetric-Cone Multiplicative Weights Update (SCMWU), a projection-free algorithm for online optimization over the trace-one slice of an arbitrary symmetric cone. We show that SCMWU is equivalent to Follow-the-Regularized-Leader and Online Mirror Descent with symmetric-cone negative entropy as regularizer. Using this structural result we show that SCMWU is a no-regret algorithm, and verify our theoretical results with extensive experiments. Our results unify and generalize the analysis for the Multiplicative Weights Update method over the probability simplex and the Matrix Multiplicative Weights Update method over the set of density matrices.

51.0LGMay 25
Online Learning on Hidden-Convex Losses via Algorithmic Equivalence: Optimal Regret, Geometric Barrier, and Bandit Feedback

Anas Barakat, Andreas Kontogiannis, Vasilis Pollatos et al.

We study adversarial online learning with hidden-convex losses, i.e., nonconvex losses that become convex after a nonlinear reparameterization. Ghai, Lu and Hazan (2022) proved that, under geometric and smoothness assumptions, online gradient descent (OGD) on such nonconvex losses approximately simulates online mirror descent (OMD) on the underlying convex losses with a suitable regularizer, yielding $\mathcal{O}(T^{2/3})$ regret. They left open whether the optimal $Θ(\sqrt{T})$ regret from online convex optimization can be recovered in this hidden-convex setting. We answer this question affirmatively. More specifically, via a sharper discrete-time algorithmic equivalence argument, we prove that OGD achieves $\mathcal{O}(\sqrt{T})$ regret under the same assumptions, matching the optimal worst-case rate for adversarial online convex optimization. We also address another open question of Ghai, Lu and Hazan (2022) by clarifying the geometry required for this algorithmic equivalence. We replace the diagonal-Jacobian sufficient condition with a necessary-and-sufficient Hessian compatibility condition, thereby expanding the class of admissible reparameterizations. We complement our tight regret bound with a lower bound showing that the Hessian compatibility assumption is essential for OGD; when it fails, we construct a smooth reparameterization and an adversarial sequence of hidden-convex losses for which OGD suffers $Ω(T)$ regret. Finally, we extend our analysis to one-point bandit feedback and prove a $\mathcal{O}(T^{3/4})$ expected regret bound for bandit OGD with spherical smoothing, matching its classical rate on convex losses.

GTJul 13, 2023
Data-Scarce Identification of Game Dynamics via Sum-of-Squares Optimization

Iosif Sakos, Antonios Varvitsiotis, Georgios Piliouras

Understanding how players adjust their strategies in games, based on their experience, is a crucial tool for policymakers. It enables them to forecast the system's eventual behavior, exert control over the system, and evaluate counterfactual scenarios. The task becomes increasingly difficult when only a limited number of observations are available or difficult to acquire. In this work, we introduce the Side-Information Assisted Regression (SIAR) framework, designed to identify game dynamics in multiplayer normal-form games only using data from a short run of a single system trajectory. To enhance system recovery in the face of scarce data, we integrate side-information constraints into SIAR, which restrict the set of feasible solutions to those satisfying game-theoretic properties and common assumptions about strategic interactions. SIAR is solved using sum-of-squares (SOS) optimization, resulting in a hierarchy of approximations that provably converge to the true dynamics of the system. We showcase that the SIAR framework accurately predicts player behavior across a spectrum of normal-form games, widely-known families of game dynamics, and strong benchmarks, even if the unknown system is chaotic.

GTFeb 12
Convex Markov Games and Beyond: New Proof of Existence, Characterization and Learning Algorithms for Nash Equilibria

Anas Barakat, Ioannis Panageas, Antonios Varvitsiotis

Convex Markov Games (cMGs) were recently introduced as a broad class of multi-agent learning problems that generalize Markov games to settings where strategic agents optimize general utilities beyond additive rewards. While cMGs expand the modeling frontier, their theoretical foundations, particularly the structure of Nash equilibria (NE) and guarantees for learning algorithms, are not yet well understood. In this work, we address these gaps for an extension of cMGs, which we term General Utility Markov Games (GUMGs), capturing new applications requiring coupling between agents' occupancy measures. We prove that in GUMGs, Nash equilibria coincide with the fixed points of projected pseudo-gradient dynamics (i.e., first-order stationary points), enabled by a novel agent-wise gradient domination property. This insight also yields a simple proof of NE existence using Brouwer's fixed-point theorem. We further show the existence of Markov perfect equilibria. Building on this characterization, we establish a policy gradient theorem for GUMGs and design a model-free policy gradient algorithm. For potential GUMGs, we establish iteration complexity guarantees for computing approximate-NE under exact gradients and provide sample complexity bounds in both the generative model and on-policy settings. Our results extend beyond prior work restricted to zero-sum cMGs, providing the first theoretical analysis of common-interest cMGs.

59.4SYMay 15
Fairness-Guaranteed Online Power Allocation Policies for EV Fast Charging Stations

Can Berk Saner, Yong-Sheng Soh, Antonios Varvitsiotis

The rapid expansion of electric vehicles (EVs) necessitates scalable and efficient fast charging station (FCS) infrastructure. These stations often operate in oversubscribed configurations where the total port rating exceeds a station-level cap reflecting infrastructure limits, grid constraints or market setpoints. In such settings, ensuring fairness in real-time power allocation is essential to prevent user bias and secure equitable access to limited resources while maximizing infrastructure utilization. This task is further complicated by state-of-charge dependent EV power limits defined by charge curves, for which accurate data is often unavailable. This paper introduces two fairness-guaranteed online power allocation policies: FAIR-OPAP-C for conventional FCSs with continuously adjustable power delivery, and FAIR-OPAP-M for modular FCSs composed of discrete assignable power modules. Unlike existing methods, these algorithms require no prior knowledge of charge curves, utilizing only instantaneous power requests available via standard protocols. We formalize fairness with a unified framework encompassing envy-freeness, Pareto efficiency, and proportionality, and establish theoretical guarantees for both algorithms. The algorithms rely on lightweight operations, achieving near-linear and logarithmic scalability for the conventional and modular cases, respectively. Comprehensive evaluations show the proposed methods achieve superior performance across various metrics among seven benchmarks from EV charging and fair division literature. Furthermore, they are orders of magnitude faster than optimization-based approaches, with runtimes below 1 ms for up to 300 EVs, validating their suitability for real-time deployment on hardware-constrained edge devices.

79.6GTMay 13
When and Why is Optimistic Multiplicative Weights Slow? The Geometry of Energy Dissipation

John Lazarsfeld, Anas Barakat, Georgios Piliouras et al.

This paper studies the convergence of the Optimistic Multiplicative Weights Update algorithm (OMWU) in two player zero-sum games. Recent works have identified instances on which the last-iterate of OMWU can converge arbitrarily slowly, but understanding when and why this slow convergence occurs has remained open. In this work, we develop a new analysis framework that gives sharp, quantitative explanations for this behavior. Our analysis is based on viewing the algorithm's dual iterates as an optimistic skew-gradient descent with respect to an energy function. We prove over the dual iterates that energy is dissipative, and by establishing tight bounds on the magnitude of dissipation, our analysis quantifies the geometric bottlenecks that arise when the corresponding primal iterates are close to the simplex boundary. This further translates into a new linear last-iterate convergence rate in KL divergence on games with a unique and interior Nash equilibrium. Compared to prior work, this new rate contains a much sharper dependence on game-specific constants, and we prove this dependence is optimal. Moreover, these geometric insights further translate into new separations on uniform convergence rates for OMWU. On the one hand, we prove constant lower bounds on the uniform best-iterate convergence rate in KL divergence and total variation distance from Nash. On the other hand, we establish for the $2\times 2$ setting a new ${\widetilde O}(T^{-1/2})$ best-iterate rate in duality gap, improving substantially over prior work. Together, this shows in general that uniform convergence rate guarantees do not transfer across different measures of distance to Nash.

LGJun 23, 2025
Online Multi-Agent Control with Adversarial Disturbances

Anas Barakat, John Lazarsfeld, Georgios Piliouras et al.

Online multi-agent control problems, where many agents pursue competing and time-varying objectives, are widespread in domains such as autonomous robotics, economics, and energy systems. In these settings, robustness to adversarial disturbances is critical. In this paper, we study online control in multi-agent linear dynamical systems subject to such disturbances. In contrast to most prior work in multi-agent control, which typically assumes noiseless or stochastically perturbed dynamics, we consider an online setting where disturbances can be adversarial, and where each agent seeks to minimize its own sequence of convex losses. Under two feedback models, we analyze online gradient-based controllers with local policy updates. We prove per-agent regret bounds that are sublinear and near-optimal in the time horizon and that highlight different scalings with the number of agents. When agents' objectives are aligned, we further show that the multi-agent control problem induces a time-varying potential game for which we derive equilibrium tracking guarantees. Together, our results take a first step in bridging online control with online learning in games, establishing robust individual and collective performance guarantees in dynamic continuous-state environments.

OCApr 4, 2025
Optimistic Online Learning in Symmetric Cone Games

Anas Barakat, Wayne Lin, John Lazarsfeld et al.

We introduce symmetric cone games (SCGs), a broad class of multi-player games where each player's strategy lies in a generalized simplex (the trace-one slice of a symmetric cone). This framework unifies a wide spectrum of settings, including normal-form games (simplex strategies), quantum games (density matrices), and continuous games with ball-constrained strategies. It also captures several structured machine learning and optimization problems, such as distance metric learning and Fermat-Weber facility location, as two-player zero-sum SCGs. To compute approximate Nash equilibria in two-player zero-sum SCGs, we propose a single online learning algorithm: Optimistic Symmetric Cone Multiplicative Weights Updates (OSCMWU). Unlike prior methods tailored to specific geometries, OSCMWU provides closed-form, projection-free updates over any symmetric cone and achieves an optimal $\tilde{\mathcal{O}}(1/ε)$ iteration complexity for computing $ε$-saddle points. Our analysis builds on the Optimistic Follow-the-Regularized-Leader framework and hinges on a key technical contribution: We prove that the symmetric cone negative entropy is strongly convex with respect to the trace-one norm. This result extends known results for the simplex and spectraplex to all symmetric cones, and may be of independent interest.

OCAug 2, 2021
Multiplicative updates for symmetric-cone factorizations

Yong Sheng Soh, Antonios Varvitsiotis

Given a matrix $X\in \mathbb{R}^{m\times n}_+$ with non-negative entries, the cone factorization problem over a cone $\mathcal{K}\subseteq \mathbb{R}^k$ concerns computing $\{ a_1,\ldots, a_{m} \} \subseteq \mathcal{K}$ and $\{ b_1,\ldots, b_{n} \} \subseteq~\mathcal{K}^*$ belonging to its dual so that $X_{ij} = \langle a_i, b_j \rangle$ for all $i\in [m], j\in [n]$. Cone factorizations are fundamental to mathematical optimization as they allow us to express convex bodies as feasible regions of linear conic programs. In this paper, we introduce and analyze the symmetric-cone multiplicative update (SCMU) algorithm for computing cone factorizations when $\mathcal{K}$ is symmetric; i.e., it is self-dual and homogeneous. Symmetric cones are of central interest in mathematical optimization as they provide a common language for studying linear optimization over the nonnegative orthant (linear programs), over the second-order cone (second order cone programs), and over the cone of positive semidefinite matrices (semidefinite programs). The SCMU algorithm is multiplicative in the sense that the iterates are updated by applying a meticulously chosen automorphism of the cone computed using a generalization of the geometric mean to symmetric cones. Using an extension of Lieb's concavity theorem and von Neumann's trace inequality to symmetric cones, we show that the squared loss objective is non-decreasing along the trajectories of the SCMU algorithm. Specialized to the nonnegative orthant, the SCMU algorithm corresponds to the seminal algorithm by Lee and Seung for computing Nonnegative Matrix Factorizations.

OCJun 1, 2021
A Non-commutative Extension of Lee-Seung's Algorithm for Positive Semidefinite Factorizations

Yong Sheng Soh, Antonios Varvitsiotis

Given a matrix $X\in \mathbb{R}_+^{m\times n}$ with nonnegative entries, a Positive Semidefinite (PSD) factorization of $X$ is a collection of $r \times r$-dimensional PSD matrices $\{A_i\}$ and $\{B_j\}$ satisfying $X_{ij}= \mathrm{tr}(A_i B_j)$ for all $\ i\in [m],\ j\in [n]$. PSD factorizations are fundamentally linked to understanding the expressiveness of semidefinite programs as well as the power and limitations of quantum resources in information theory. The PSD factorization task generalizes the Non-negative Matrix Factorization (NMF) problem where we seek a collection of $r$-dimensional nonnegative vectors $\{a_i\}$ and $\{b_j\}$ satisfying $X_{ij}= a_i^\top b_j$, for all $i\in [m],\ j\in [n]$ -- one can recover the latter problem by choosing matrices in the PSD factorization to be diagonal. The most widely used algorithm for computing NMFs of a matrix is the Multiplicative Update algorithm developed by Lee and Seung, in which nonnegativity of the updates is preserved by scaling with positive diagonal matrices. In this paper, we describe a non-commutative extension of Lee-Seung's algorithm, which we call the Matrix Multiplicative Update (MMU) algorithm, for computing PSD factorizations. The MMU algorithm ensures that updates remain PSD by congruence scaling with the matrix geometric mean of appropriate PSD matrices, and it retains the simplicity of implementation that Lee-Seung's algorithm enjoys. Building on the Majorization-Minimization framework, we show that under our update scheme the squared loss objective is non-increasing and fixed points correspond to critical points. The analysis relies on Lieb's Concavity Theorem. Beyond PSD factorizations, we use the MMU algorithm as a primitive to calculate block-diagonal PSD factorizations and tensor PSD factorizations. We demonstrate the utility of our method with experiments on real and synthetic data.

LGFeb 26, 2020
Convergence to Second-Order Stationarity for Non-negative Matrix Factorization: Provably and Concurrently

Ioannis Panageas, Stratis Skoulakis, Antonios Varvitsiotis et al.

Non-negative matrix factorization (NMF) is a fundamental non-convex optimization problem with numerous applications in Machine Learning (music analysis, document clustering, speech-source separation etc). Despite having received extensive study, it is poorly understood whether or not there exist natural algorithms that can provably converge to a local minimum. Part of the reason is because the objective is heavily symmetric and its gradient is not Lipschitz. In this paper we define a multiplicative weight update type dynamics (modification of the seminal Lee-Seung algorithm) that runs concurrently and provably avoids saddle points (first order stationary points that are not second order). Our techniques combine tools from dynamical systems such as stability and exploit the geometry of the NMF objective by reducing the standard NMF formulation over the non-negative orthant to a new formulation over (a scaled) simplex. An important advantage of our method is the use of concurrent updates, which permits implementations in parallel computing environments.

OCJun 11, 2019
Analysis of Optimization Algorithms via Sum-of-Squares

Sandra S. Y. Tan, Antonios Varvitsiotis, Vincent Y. F. Tan

We introduce a new framework for unifying and systematizing the performance analysis of first-order black-box optimization algorithms for unconstrained convex minimization. The low-cost iteration complexity enjoyed by first-order algorithms renders them particularly relevant for applications in machine learning and large-scale data analysis. Relying on sum-of-squares (SOS) optimization, we introduce a hierarchy of semidefinite programs that give increasingly better convergence bounds for higher levels of the hierarchy. Alluding to the power of the SOS hierarchy, we show that the (dual of the) first level corresponds to the Performance Estimation Problem (PEP) introduced by Drori and Teboulle [Math. Program., 145(1):451--482, 2014], a powerful framework for determining convergence rates of first-order optimization algorithms. Consequently, many results obtained within the PEP framework can be reinterpreted as degree-1 SOS proofs, and thus, the SOS framework provides a promising new approach for certifying improved rates of convergence by means of higher-order SOS certificates. To determine analytical rate bounds, in this work we use the first level of the SOS hierarchy and derive new result{s} for noisy gradient descent with inexact line search methods (Armijo, Wolfe, and Goldstein).

QUANT-PHMar 13, 2018
Characterising the correlations of prepare-and-measure quantum networks

Yukun Wang, Ignatius William Primaatmaja, Emilien Lavie et al.

Prepare-and-measure (P&M) quantum networks are the basic building blocks of quantum communication and cryptography. These networks crucially rely on non-orthogonal quantum encodings to distribute quantum correlations, thus enabling superior communication rates and information-theoretic security. Here, we present a computational toolbox that is able to efficiently characterise the set of input-output probability distributions for any discrete-variable P&M quantum network, assuming only the inner-product information of the quantum encodings. Our toolbox is thus highly versatile and can be used to analyse a wide range of quantum network protocols, including those that employ infinite-dimensional quantum code states. To demonstrate the feasibility and efficacy of our toolbox, we use it to reveal new results in multipartite quantum distributed computing and quantum cryptography. Taken together, these findings suggest that our method may have implications for quantum network information theory and the development of new quantum technologies.