87.9GTJun 1
The Price of Decentralization in Block BuildingBurak Öz, Fei Wu, Luis Correia et al.
Decentralized block building mechanisms replace the monopoly of a single proposer with multiple builders. However, their censorship-resistance and fair-access benefits depend not only on the number of builders, but also on whether builders are geographically positioned to provide timely transaction coverage. We study this tension between builder location choice, user transaction coverage, and reward concentration by modeling decentralized block building as a stochastic coverage game. Builders choose regions, information sources emit transactions over a block construction round, and latency determines whether each transaction is received before the deadline. We show that the builder region game is an exact potential game and therefore admits a pure Nash equilibrium. We prove an asymptotically tight factor-2 Price of Anarchy bound, quantifying the price of decentralization from uncoordinated builder placement, and derive tight bounds on builder utility concentration, showing that the lowest-utility builder earns at least half of the highest-utility builder's payoff, and the utility-share HHI is at most 12.5% above the egalitarian benchmark. We complement the theory with simulations, studying the builder region game under richer latency and source environments. We find that welfare losses are most pronounced in intermediate regimes where peripheral sources are reachable and valuable, but selfish incentives still favor regions with strong access to high-value sources. We also find that geographic and utility concentration need not align: planner allocations can improve coverage by assigning builders to lower-payoff peripheral regions, while equilibrium outcomes can be more geographically concentrated but more utility-balanced. We connect our findings to protocol design and discuss future directions on location-market modeling and alternative reward-sharing rules.
LGNov 3, 2023
AlberDICE: Addressing Out-Of-Distribution Joint Actions in Offline Multi-Agent RL via Alternating Stationary Distribution Correction EstimationDaiki E. Matsunaga, Jongmin Lee, Jaeseok Yoon et al.
One of the main challenges in offline Reinforcement Learning (RL) is the distribution shift that arises from the learned policy deviating from the data collection policy. This is often addressed by avoiding out-of-distribution (OOD) actions during policy improvement as their presence can lead to substantial performance degradation. This challenge is amplified in the offline Multi-Agent RL (MARL) setting since the joint action space grows exponentially with the number of agents. To avoid this curse of dimensionality, existing MARL methods adopt either value decomposition methods or fully decentralized training of individual agents. However, even when combined with standard conservatism principles, these methods can still result in the selection of OOD joint actions in offline MARL. To this end, we introduce AlberDICE, an offline MARL algorithm that alternatively performs centralized training of individual agents based on stationary distribution optimization. AlberDICE circumvents the exponential complexity of MARL by computing the best response of one agent at a time while effectively avoiding OOD joint action selection. Theoretically, we show that the alternating optimization procedure converges to Nash policies. In the experiments, we demonstrate that AlberDICE significantly outperforms baseline algorithms on a standard suite of MARL benchmarks.
LGApr 18, 2025
Bitcoin's Edge: Embedded Sentiment in Blockchain Transactional DataCharalampos Kleitsikas, Nikolaos Korfiatis, Stefanos Leonardos et al.
Cryptocurrency blockchains, beyond their primary role as distributed payment systems, are increasingly used to store and share arbitrary content, such as text messages and files. Although often non-financial, this hidden content can impact price movements by conveying private information, shaping sentiment, and influencing public opinion. However, current analyses of such data are limited in scope and scalability, primarily relying on manual classification or hand-crafted heuristics. In this work, we address these limitations by employing Natural Language Processing techniques to analyze, detect patterns, and extract public sentiment encoded within blockchain transactional data. Using a variety of Machine Learning techniques, we showcase for the first time the predictive power of blockchain-embedded sentiment in forecasting cryptocurrency price movements on the Bitcoin and Ethereum blockchains. Our findings shed light on a previously underexplored source of freely available, transparent, and immutable data and introduce blockchain sentiment analysis as a novel and robust framework for enhancing financial predictions in cryptocurrency markets. Incidentally, we discover an asymmetry between cryptocurrencies; Bitcoin has an informational advantage over Ethereum in that the sentiment embedded into transactional data is sufficient to predict its price movement.
GTJun 24, 2021
Exploration-Exploitation in Multi-Agent Competition: Convergence with Bounded RationalityStefanos Leonardos, Georgios Piliouras, Kelly Spendlove
The interplay between exploration and exploitation in competitive multi-agent learning is still far from being well understood. Motivated by this, we study smooth Q-learning, a prototypical learning model that explicitly captures the balance between game rewards and exploration costs. We show that Q-learning always converges to the unique quantal-response equilibrium (QRE), the standard solution concept for games under bounded rationality, in weighted zero-sum polymatrix games with heterogeneous learning agents using positive exploration rates. Complementing recent results about convergence in weighted potential games, we show that fast convergence of Q-learning in competitive settings is obtained regardless of the number of agents and without any need for parameter fine-tuning. As showcased by our experiments in network zero-sum games, these theoretical results provide the necessary guarantees for an algorithmic approach to the currently open problem of equilibrium selection in competitive multi-agent settings.
LGJun 3, 2021
Global Convergence of Multi-Agent Policy Gradient in Markov Potential GamesStefanos Leonardos, Will Overman, Ioannis Panageas et al.
Potential games are arguably one of the most important and widely studied classes of normal form games. They define the archetypal setting of multi-agent coordination as all agent utilities are perfectly aligned with each other via a common potential function. Can this intuitive framework be transplanted in the setting of Markov Games? What are the similarities and differences between multi-agent coordination with and without state dependence? We present a novel definition of Markov Potential Games (MPG) that generalizes prior attempts at capturing complex stateful multi-agent coordination. Counter-intuitively, insights from normal-form potential games do not carry over as MPGs can consist of settings where state-games can be zero-sum games. In the opposite direction, Markov games where every state-game is a potential game are not necessarily MPGs. Nevertheless, MPGs showcase standard desirable properties such as the existence of deterministic Nash policies. In our main technical result, we prove fast convergence of independent policy gradient to Nash policies by adapting recent gradient dominance property arguments developed for single agent MDPs to multi-agent learning settings.
GTFeb 21, 2021
Dynamical Analysis of the EIP-1559 Ethereum Fee MarketStefanos Leonardos, Barnabé Monnot, Daniël Reijsbergen et al.
Participation in permissionless blockchains results in competition over system resources, which needs to be controlled with fees. Ethereum's current fee mechanism is implemented via a first-price auction that results in unpredictable fees as well as other inefficiencies. EIP-1559 is a recent, improved proposal that introduces a number of innovative features such as a dynamically adaptive base fee that is burned, instead of being paid to the miners. Despite intense interest in understanding its properties, several basic questions such as whether and under what conditions does this protocol self-stabilize have remained elusive thus far. We perform a thorough analysis of the resulting fee market dynamic mechanism via a combination of tools from game theory and dynamical systems. We start by providing bounds on the step-size of the base fee update rule that suffice for global convergence to equilibrium via Lyapunov arguments. In the negative direction, we show that for larger step-sizes instability and even formally chaotic behavior are possible under a wide range of settings. We complement these qualitative results with quantitative bounds on the resulting range of base fees. We conclude our analysis with a thorough experimental case study that corroborates our theoretical findings.
CRJun 15, 2019
PREStO: A Systematic Framework for Blockchain Consensus ProtocolsStefanos Leonardos, Daniel Reijsbergen, Georgios Piliouras
The rapid evolution of blockchain technology has brought together stakeholders from fundamentally different backgrounds. The result is a diverse ecosystem, as exemplified by the development of a wide range of different blockchain protocols. This raises questions for decision and policy makers: How do different protocols compare? What are their trade-offs? Existing efforts to survey the area reveal a fragmented terminology and the lack of a unified framework to reason about the properties of blockchain protocols. In this paper, we work towards bridging this gap. We present a five-dimensional design space with a modular structure in which protocols can be compared and understood. Based on these five axes -- Optimality, Stability, Efficiency, Robustness and Persistence -- we organize the properties of existing protocols in subcategories of increasing granularity. The result is a dynamic scheme -- termed the PREStO framework -- which aids the interaction between stakeholders of different backgrounds, including managers and investors, and which enables systematic reasoning about blockchain protocols. We illustrate its value by comparing existing protocols and identifying research challenges, hence making a first step towards understanding the blockchain ecosystem through a more comprehensive lens.
GTApr 4, 2019
Oceanic Games: Centralization Risks and Incentives in Blockchain MiningNikos Leonardos, Stefanos Leonardos, Georgios Piliouras
To participate in the distributed consensus of permissionless blockchains, prospective nodes -- or miners -- provide proof of designated, costly resources. However, in contrast to the intended decentralization, current data on blockchain mining unveils increased concentration of these resources in a few major entities, typically mining pools. To study strategic considerations in this setting, we employ the concept of Oceanic Games, Milnor and Shapley (1978). Oceanic Games have been used to analyze decision making in corporate settings with small numbers of dominant players (shareholders) and large numbers of individually insignificant players, the ocean. Unlike standard equilibrium models, they focus on measuring the value (or power) per entity and per unit of resource} in a given distribution of resources. These values are viewed as strategic components in coalition formations, mergers and resource acquisitions. Considering such issues relevant to blockchain governance and long-term sustainability, we adapt oceanic games to blockchain mining and illustrate the defined concepts via examples. The application of existing results reveals incentives for individual miners to merge in order to increase the value of their resources. This offers an alternative perspective to the observed centralization and concentration of mining power. Beyond numerical simulations, we use the model to identify issues relevant to the design of future cryptocurrencies and formulate prospective research questions.
GTMar 11, 2019
Weighted Voting on the Blockchain: Improving Consensus in Proof of Stake ProtocolsStefanos Leonardos, Daniel Reijsbergen, Georgios Piliouras
Proof of Stake (PoS) protocols rely on voting mechanisms to reach consensus on the current state. If an enhanced majority of staking nodes, also called validators, agree on a proposed block, then this block is appended to the blockchain. Yet, these protocols remain vulnerable to faults caused by validators who abstain either accidentally or maliciously. To protect against such faults while retaining the PoS selection and reward allocation schemes, we study weighted voting in validator committees. We formalize the block creation process and introduce validators' voting profiles which we update by a multiplicative weights algorithm relative to validators' voting behavior and aggregate blockchain rewards. Using this framework, we leverage weighted majority voting rules that optimize collective decision making to show, both numerically and analytically, that the consensus mechanism is more robust if validators' votes are appropriately scaled. We raise potential issues and limitations of weighted voting in trustless, decentralized networks and relate our results to the design of current PoS protocols.
CRMar 11, 2019
Incentives in Ethereum's Hybrid Casper ProtocolVitalik Buterin, Daniel Reijsbergen, Stefanos Leonardos et al.
We present an overview of hybrid Casper the Friendly Finality Gadget (FFG): a Proof-of-Stake checkpointing protocol overlaid onto Ethereum's Proof-of-Work blockchain. We describe its core functionalities and reward scheme, and explore its properties. Our findings indicate that Casper's implemented incentives mechanism ensures liveness, while providing safety guarantees that improve over standard Proof-of-Work protocols. Based on a minimal-impact implementation of the protocol as a smart contract on the blockchain, we discuss additional issues related to parametrisation, funding, throughput and network overhead and detect potential limitations.