GT May 21
Single-Item Auctions with a Monopolist Intermediary
Jingyi Liu, Aviad Rubinstein, Ertem Nusret Tas et al.
Classical optimal auction theory assumes that bids reach the seller directly. We study how this picture changes when a revenue-maximizing intermediary controls access to the seller's auction. Motivated by blockchain auctions, online platforms, and other intermediated markets, we consider a single-item auction with independent private values and a monopolist intermediary who can decide which bidder messages are forwarded to the seller. We establish approximation guarantees and impossibility results across three timing models: seller-first, intermediary-first, and simultaneous. In the seller-first model, arbitrary deterministic seller mechanisms collapse to posted-price mechanisms, and the intermediary's best response is a shifted Myerson auction. This yields a sharp separation: for regular distributions, the seller's revenue can be arbitrarily small relative to the no-intermediary optimum, while for $α$-strongly regular distributions, posted prices recover a constant fraction of the optimum with a tight dependence on $α$. We further show that timing matters: neither Stackelberg order uniformly dominates, and simultaneous play can leave both parties unboundedly worse off than in either sequential model.
GT May 7
Adversarial procurement in blockchains
Maryam Bahrani, Michael Neuder, S. Matthew Weinberg
An emerging blockchain protocol design pattern leverages the asymmetry between the computational effort in performing versus verifying tasks. For example, cryptographic validity proofs (e.g., SNARKs) require the prover to expend significant effort demonstrating the correctness of their claim, while the verifiers benefit from extremely easy validation. The operationalization of this paradigm requires efficiently soliciting the performance of expensive tasks in pseudonymous, adversarial environments. We formalize this as a mechanism design question. The protocol balances the economic cost of a liveness fault, where the work is not completed, with the payments required to incentivize specific behavior from candidate suppliers. We show that the loss of the optimal protocol scales logarithmically in the cost of a liveness fault, scaled up by the adversarial fraction of the network. Further, we find that the optimal equilibria have an intuitive structure, allowing us to provide concrete advice to practitioners. Specifically, in many regimes, the optimum designates a single, random node as the primary worker and a committee as a fallback, which is reminiscent of leader-based consensus mechanisms. We also characterize the asymptotic regimes where having negative payments (i.e., slashing in blockchain parlance) is especially helpful.
GT Jan 29, 2024
Contracting with a Learning Agent
Guru Guruganesh, Yoav Kolumbus, Jon Schneider et al.
Many real-life contractual relations differ completely from the clean, static model at the heart of principal-agent theory. Typically, they involve repeated strategic interactions of the principal and agent, taking place under uncertainty and over time. While appealing in theory, players seldom use complex dynamic strategies in practice, often preferring to circumvent complexity and approach uncertainty through learning. We initiate the study of repeated contracts with a learning agent, focusing on agents who achieve no-regret outcomes. Optimizing against a no-regret agent is a known open problem in general games; we achieve an optimal solution to this problem for a canonical contract setting, in which the agent's choice among multiple actions leads to success/failure. The solution has a surprisingly simple structure: for some $α > 0$, initially offer the agent a linear contract with scalar $α$, then switch to offering a linear contract with scalar $0$. This switch causes the agent to "free-fall" through their action space and during this time provides the principal with non-zero reward at zero cost. Despite apparent exploitation of the agent, this dynamic contract can leave both players better off compared to the best static contract. Our results generalize beyond success/failure, to arbitrary non-linear contracts which the principal rescales dynamically. Finally, we quantify the dependence of our results on knowledge of the time horizon, and are the first to address this consideration in the study of strategizing against learning agents.
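The static building block behind the abstract above is the agent's best response to a single linear contract in a success/failure setting. The following is a minimal sketch, not the paper's construction: it assumes the principal's reward is normalized to 1 on success, that a linear contract with scalar α pays the agent α on success, and that action i succeeds with probability `success_probs[i]` at cost `costs[i]`. The function names and the three-action instance are hypothetical, and the learning dynamics (including the free-fall phase) are not modeled.

```python
def agent_best_response(alpha, success_probs, costs):
    """Index of the action maximizing the agent's expected utility alpha * p - c."""
    utilities = [alpha * p - c for p, c in zip(success_probs, costs)]
    return max(range(len(utilities)), key=lambda i: utilities[i])


def principal_utility(alpha, success_probs, costs):
    """Principal's expected utility (1 - alpha) * p at the agent's best response."""
    action = agent_best_response(alpha, success_probs, costs)
    return (1 - alpha) * success_probs[action]


# Hypothetical instance: a free null action, a cheap action, a costly action.
success_probs = [0.0, 0.5, 1.0]
costs = [0.0, 0.1, 0.4]

for alpha in (0.0, 0.5, 0.8):
    action = agent_best_response(alpha, success_probs, costs)
    print(alpha, action, principal_utility(alpha, success_probs, costs))
```

In this instance a myopic agent facing scalar 0 picks the free null action immediately; the abstract's point is that a no-regret learner instead moves there gradually.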
GT Jul 8, 2021
Proof-of-Stake Mining Games with Perfect Randomness
Matheus V. X. Ferreira, S. Matthew Weinberg
Proof-of-Stake blockchains based on a longest-chain consensus protocol are an attractive energy-friendly alternative to the Proof-of-Work paradigm. However, formal barriers to "getting the incentives right" were recently discovered, driven by the desire to use the blockchain itself as a source of pseudorandomness (Brown-Cohen et al., 2019). We consider instead a longest-chain Proof-of-Stake protocol with perfect, trusted, external randomness (e.g. a randomness beacon). We produce two main results. First, we show that a strategic miner can strictly outperform an honest miner with just $32.5\%$ of the total stake. Note that a miner of this size cannot outperform an honest miner in the Proof-of-Work model. This establishes that even with access to a perfect randomness beacon, incentives in Proof-of-Work and Proof-of-Stake longest-chain protocols are fundamentally different. Second, we prove that a strategic miner cannot outperform an honest miner with $30.8\%$ of the total stake. This means that, while not quite as secure as the Proof-of-Work regime, desirable incentive properties of Proof-of-Work longest-chain protocols can be approximately recovered via Proof-of-Stake with a perfect randomness beacon. The space of possible strategies in a Proof-of-Stake mining game is significantly richer than in a Proof-of-Work game. Our main technical contribution is a characterization of potentially optimal strategies for a strategic miner, and in particular, a proof that the corresponding infinite-state MDP admits an optimal strategy that is positive recurrent.
LG Jul 5, 2020
Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions
Michael Chang, Sidhant Kaushik, S. Matthew Weinberg et al.
This paper seeks to establish a framework for directing a society of simple, specialized, self-interested agents to solve what traditionally are posed as monolithic single-agent sequential decision problems. What makes it challenging to use a decentralized approach to collectively optimize a central objective is the difficulty in characterizing the equilibrium strategy profile of non-cooperative games. To overcome this challenge, we design a mechanism for defining the learning environment of each agent for which we know that the optimal solution for the global objective coincides with a Nash equilibrium strategy profile of the agents optimizing their own local objectives. The society functions as an economy of agents that learn the credit assignment process itself by buying and selling to each other the right to operate on the environment state. We derive a class of decentralized reinforcement learning algorithms that are broadly applicable not only to standard reinforcement learning but also for selecting options in semi-MDPs and dynamically composing computation graphs. Lastly, we demonstrate the potential advantages of a society's inherent modular structure for more efficient transfer learning.
GT Jun 10, 2020
Auction learning as a two-player game
Jad Rahme, Samy Jelassi, S. Matthew Weinberg
Designing an incentive compatible auction that maximizes expected revenue is a central problem in Auction Design. While theoretical approaches to the problem have hit some limits, a recent research direction initiated by Duetting et al. (2019) consists in building neural network architectures to find optimal auctions. We propose two conceptual deviations from their approach which result in enhanced performance. First, we use recent results in theoretical auction design (Rubinstein and Weinberg, 2018) to introduce a time-independent Lagrangian. This not only circumvents the need for an expensive hyper-parameter search (as in prior work), but also provides a principled metric to compare the performance of two auctions (absent from prior work). Second, the optimization procedure in previous work uses an inner maximization loop to compute optimal misreports. We amortize this process through the introduction of an additional neural network. We demonstrate the effectiveness of our approach by learning competitive or strictly improved auctions compared to prior work. Both results together further imply a novel formulation of Auction Design as a two-player game with stationary utility functions.
GT Apr 3, 2020
Credible, Truthful, and Two-Round (Optimal) Auctions via Cryptographic Commitments
Matheus V. X. Ferreira, S. Matthew Weinberg
We consider the sale of a single item to multiple buyers by a revenue-maximizing seller. Recent work of Akbarpour and Li formalizes credibility as an auction desideratum and proves that the only optimal, credible, strategyproof auction is the ascending price auction with reserves (Akbarpour and Li, 2019). In contrast, when buyers' valuations are MHR, we show that the mild additional assumption of a cryptographically secure commitment scheme suffices for a simple two-round auction which is optimal, strategyproof, and credible (even when the number of bidders is only known by the auctioneer). We extend our analysis to the case when buyer valuations are $α$-strongly regular for any $α > 0$, up to arbitrary $\varepsilon$ in credibility. Interestingly, we also prove that this construction cannot be extended to regular distributions, nor can the $\varepsilon$ be removed with multiple bidders.
GT Mar 2, 2020
A Permutation-Equivariant Neural Network Architecture For Auction Design
Jad Rahme, Samy Jelassi, Joan Bruna et al.
Designing an incentive compatible auction that maximizes expected revenue is a central problem in Auction Design. Theoretical approaches to the problem have hit some limits in the past decades and analytical solutions are known for only a few simple settings. Computational approaches to the problem through the use of LPs have their own set of limitations. Building on the success of deep learning, a new approach was recently proposed by Duetting et al. (2019) in which the auction is modeled by a feed-forward neural network and the design problem is framed as a learning problem. The neural architectures used in that work are general purpose and do not take advantage of any of the symmetries the problem could present, such as permutation equivariance. In this work, we consider auction design problems that have permutation-equivariant symmetry and construct a neural architecture that is capable of perfectly recovering the permutation-equivariant optimal mechanism, which we show is not possible with the previous architecture. We demonstrate that permutation-equivariant architectures are not only capable of recovering previous results, they also have better generalization properties.
GT Feb 26, 2019
Selling a Single Item with Negative Externalities
Tithi Chattopadhyay, Nick Feamster, Matheus V. X. Ferreira et al.
We consider the problem of regulating products with negative externalities to a third party that is neither the buyer nor the seller, but where both the buyer and seller can take steps to mitigate the externality. The motivating example to have in mind is the sale of Internet-of-Things (IoT) devices, many of which have historically been compromised for DDoS attacks that disrupted Internet-wide services such as Twitter. Neither the buyer (i.e., consumers) nor seller (i.e., IoT manufacturers) was known to suffer from the attack, but both have the power to expend effort to secure their devices. We consider a regulator who regulates payments (via fines if the device is compromised, or market prices directly), or the product directly via mandatory security requirements. Both regulations come at a cost: implementing security requirements increases production costs, and the existence of fines decreases consumers' values, thereby reducing the seller's profits. The focus of this paper is to understand the efficiency of various regulatory policies. That is, policy A is more efficient than policy B if A more successfully minimizes negative externalities, while both A and B reduce the seller's profits equally. We develop a simple model to capture the impact of regulatory policies on a buyer's behavior. In this model, we show that for homogeneous markets, where the buyer's ability to follow security practices is always high or always low, the optimal (externality-minimizing for a given profit constraint) regulatory policy need regulate only payments or production. In arbitrary markets, by contrast, we show that while the optimal policy may require regulating both aspects, there is always an approximately optimal policy which regulates just one.
CR Nov 21, 2018
Bitcoin: A Natural Oligopoly
Nick Arnosti, S. Matthew Weinberg
Although Bitcoin was intended to be a decentralized digital currency, in practice, mining power is quite concentrated. This fact is a persistent source of concern for the Bitcoin community. We provide an explanation using a simple model to capture miners' incentives to invest in equipment. In our model, $n$ miners compete for a prize of fixed size. Each miner chooses an investment $q_i$, incurring cost $c_i q_i$, and then receives reward $\frac{q_i^α}{\sum_j q_j^α}$, for some $α \geq 1$. When $c_i = c_j$ for all $i,j$, and $α = 1$, there is a unique equilibrium where all miners invest equally. However, we prove that under seemingly mild deviations from this model, equilibrium outcomes become drastically more centralized. In particular, (a) When costs are asymmetric, if miner $i$ chooses to invest, then miner $j$ has market share at least $1-\frac{c_j}{c_i}$. That is, if miner $j$ has costs that are (e.g.) $20\%$ lower than those of miner $i$, then miner $j$ must control at least $20\%$ of the total mining power. (b) In the presence of economies of scale ($α > 1$), every market participant has a market share of at least $1-\frac{1}{α}$, implying that the market features at most $\frac{α}{α-1}$ miners in total. We discuss the implications of our results for the future design of cryptocurrencies. In particular, our work further motivates the study of protocols that minimize "orphaned" blocks, proof-of-stake protocols, and incentive compatible protocols.
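For the special case $α = 1$, the equilibrium of the investment game above can be computed directly, which makes bound (a) easy to check numerically. The sketch below is not taken from the paper. It assumes the first-order condition for an investing miner, $(Q - q_i)/Q^2 = c_i$ with $Q = \sum_j q_j$, which gives market share $1 - c_i Q$ and, summing over the $m$ investing miners, $Q = (m-1)/\sum_i c_i$. The function name `contest_equilibrium` is hypothetical.

```python
def contest_equilibrium(costs):
    """Equilibrium of the alpha = 1 game: returns (total investment Q, market shares)."""
    n = len(costs)
    if n < 2:
        raise ValueError("need at least two miners")
    # Lower-cost miners invest first; the two cheapest always invest.
    order = sorted(range(n), key=lambda i: costs[i])
    sorted_costs = [costs[i] for i in order]
    m = 2
    for k in range(3, n + 1):
        total_q = (k - 1) / sum(sorted_costs[:k])
        # The k-th cheapest miner invests only if its share 1 - c * Q is positive.
        if sorted_costs[k - 1] * total_q < 1:
            m = k
        else:
            break
    total_q = (m - 1) / sum(sorted_costs[:m])
    shares = [0.0] * n
    for i in order[:m]:
        shares[i] = 1 - costs[i] * total_q
    return total_q, shares


total_q, shares = contest_equilibrium([0.8, 1.0])
print(total_q, shares)
```

With costs `[0.8, 1.0]` the lower-cost miner ends up with a share of about 0.556, comfortably above the 20% lower bound from (a).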
GT Sep 18, 2018
Formal Barriers to Longest-Chain Proof-of-Stake Protocols
Jonah Brown-Cohen, Arvind Narayanan, Christos-Alexandros Psomas et al.
The security of most existing cryptocurrencies is based on a concept called Proof-of-Work, in which users must solve a computationally hard cryptopuzzle to authorize transactions ("one unit of computation, one vote"). This leads to enormous expenditure on hardware and electricity in order to collect the rewards associated with transaction authorization. Proof-of-Stake is an alternative concept that instead selects users to authorize transactions proportional to their wealth ("one coin, one vote"). Some aspects of the two paradigms are the same. For instance, obtaining voting power in Proof-of-Stake has a monetary cost just as in Proof-of-Work: a coin cannot be freely duplicated any more easily than a unit of computation. However, some aspects are fundamentally different. In particular, exactly because Proof-of-Stake is wasteless, there is no inherent resource cost to deviating (commonly referred to as the "Nothing-at-Stake" problem). In contrast to prior work, we focus on incentive-driven deviations (any participant will deviate if doing so yields higher revenue) instead of adversarial corruption (an adversary may take over a significant fraction of the network, but the remaining players follow the protocol). The main results of this paper are several formal barriers to designing incentive-compatible proof-of-stake cryptocurrencies (that don't apply to proof-of-work).
GT Aug 7, 2018
The Sample Complexity of Up-to-$\varepsilon$ Multi-Dimensional Revenue Maximization
Yannai A. Gonczarowski, S. Matthew Weinberg
We consider the sample complexity of revenue maximization for multiple bidders in unrestricted multi-dimensional settings. Specifically, we study the standard model of $n$ additive bidders whose values for $m$ heterogeneous items are drawn independently. For any such instance and any $\varepsilon>0$, we show that it is possible to learn an $\varepsilon$-Bayesian Incentive Compatible auction whose expected revenue is within $\varepsilon$ of the optimal $\varepsilon$-BIC auction from only polynomially many samples. Our fully nonparametric approach is based on ideas that hold quite generally, and completely sidesteps the difficulty of characterizing optimal (or near-optimal) auctions for these settings. Therefore, our results easily extend to general multi-dimensional settings, including valuations that are not necessarily even subadditive, and arbitrary allocation constraints. For the cases of a single bidder and many goods, or a single parameter (good) and many bidders, our analysis yields exact incentive compatibility (and for the latter also computational efficiency). Although the single-parameter case is already well-understood, our corollary for this case slightly extends the state of the art.
GT Nov 25, 2017
Selling to a No-Regret Buyer
Mark Braverman, Jieming Mao, Jon Schneider et al.
We consider the problem of a single seller repeatedly selling a single item to a single buyer (specifically, the buyer has a value drawn fresh from a known distribution $D$ in every round). Prior work assumes that the buyer is fully rational and will perfectly reason about how their bids today affect the seller's decisions tomorrow. In this work we initiate a different direction: the buyer simply runs a no-regret learning algorithm over possible bids. We provide a fairly complete characterization of optimal auctions for the seller in this domain. Specifically:
- If the buyer bids according to EXP3 (or any "mean-based" learning algorithm), then the seller can extract expected revenue arbitrarily close to the expected welfare. This auction is independent of the buyer's valuation distribution $D$, but somewhat unnatural as it is sometimes in the buyer's interest to overbid.
- There exists a learning algorithm $\mathcal{A}$ such that if the buyer bids according to $\mathcal{A}$ then the optimal strategy for the seller is simply to post the Myerson reserve for $D$ every round.
- If the buyer bids according to EXP3 (or any "mean-based" learning algorithm), but the seller is restricted to "natural" auction formats where overbidding is dominated (e.g. Generalized First-Price or Generalized Second-Price), then the optimal strategy for the seller is a pay-your-bid format with decreasing reserves over time. Moreover, the seller's optimal achievable revenue is characterized by a linear program, and can be unboundedly better than the best truthful auction yet simultaneously unboundedly worse than the expected welfare.
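The "Myerson reserve for $D$" benchmark in the second result above can be illustrated for a single buyer, where the optimal one-shot mechanism is a posted price maximizing $p \cdot \Pr[v \geq p]$. The sketch below is only an illustration under that single-buyer reading: it assumes a discrete distribution given as parallel lists of values and probabilities, and the function name `monopoly_price` is hypothetical.

```python
from fractions import Fraction


def monopoly_price(values, probs):
    """Return (price, revenue) maximizing price * Pr[value >= price] over the support."""
    best_price, best_revenue = None, None
    for price in sorted(set(values)):
        # Probability that the buyer's value is at least the posted price.
        sale_prob = sum(q for v, q in zip(values, probs) if v >= price)
        revenue = price * sale_prob
        # Strict comparison keeps the lowest price among revenue ties.
        if best_revenue is None or revenue > best_revenue:
            best_price, best_revenue = price, revenue
    return best_price, best_revenue


# Hypothetical distribution: value uniform on {1, 2, 3}.
values = [1, 2, 3]
probs = [Fraction(1, 3)] * 3
print(monopoly_price(values, probs))
```

It suffices to search the support, since raising a price up to the next support point never lowers the sale probability. Exact fractions are used so that revenue ties are compared without floating-point error.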
GT Jun 27, 2017
Multi-armed Bandit Problems with Strategic Arms
Mark Braverman, Jieming Mao, Jon Schneider et al.
We study a strategic version of the multi-armed bandit problem, where each arm is an individual strategic agent and we, the principal, pull one arm each round. When pulled, the arm receives some private reward $v_a$ and can choose an amount $x_a$ to pass on to the principal (keeping $v_a-x_a$ for itself). All non-pulled arms get reward $0$. Each strategic arm tries to maximize its own utility over the course of $T$ rounds. Our goal is to design an algorithm for the principal incentivizing these arms to pass on as much of their private rewards as possible. When private rewards are stochastically drawn each round ($v_a^t \leftarrow D_a$), we show that:
- Algorithms that perform well in the classic adversarial multi-armed bandit setting necessarily perform poorly: For all algorithms that guarantee low regret in an adversarial setting, there exist distributions $D_1,\ldots,D_k$ and an approximate Nash equilibrium for the arms where the principal receives reward $o(T)$.
- Still, there exists an algorithm for the principal that induces a game among the arms where each arm has a dominant strategy. When each arm plays its dominant strategy, the principal sees expected reward $μ'T - o(T)$, where $μ'$ is the second-largest of the means $\mathbb{E}[D_{a}]$. This algorithm maintains its guarantee if the arms are non-strategic ($x_a = v_a$), and also if there is a mix of strategic and non-strategic arms.