OCAug 19, 2013
Geometry of Power Flows and Optimization in Distribution NetworksJavad Lavaei, David Tse, Baosen Zhang · stanford
We investigate the geometry of injection regions and its relationship to optimization of power flows in tree networks. The injection region is the set of all vectors of bus power injections that satisfy the network and operation constraints. The geometrical object of interest is the set of Pareto-optimal points of the injection region. If the voltage magnitudes are fixed, the injection region of a tree network can be written as a linear transformation of the product of two-bus injection regions, one for each line in the network. Using this decomposition, we show that under the practical condition that the angle difference across each line is not too large, the set of Pareto-optimal points of the injection region remains unchanged by taking the convex hull. Moreover, the resulting convexified optimal power flow problem can be efficiently solved via }{ semi-definite programming or second order cone relaxations. These results improve upon earlier works by removing the assumptions on active power lower bounds. It is also shown that our practical angle assumption guarantees two other properties: (i) the uniqueness of the solution of the power flow problem, and (ii) the non-negativity of the locational marginal prices. Partial results are presented for the case when the voltage magnitudes are not fixed but can lie within certain bounds.
OCSep 24, 2011
Distributed Algorithms for Optimal Power Flow ProblemAlbert Y. S. Lam, Baosen Zhang, David Tse · stanford
Optimal power flow (OPF) is an important problem for power generation and it is in general non-convex. With the employment of renewable energy, it will be desirable if OPF can be solved very efficiently so its solution can be used in real time. With some special network structure, e.g. trees, the problem has been shown to have a zero duality gap and the convex dual problem yields the optimal solution. In this paper, we propose a primal and a dual algorithm to coordinate the smaller subproblems decomposed from the convexified OPF. We can arrange the subproblems to be solved sequentially and cumulatively in a central node or solved in parallel in distributed nodes. We test the algorithms on IEEE radial distribution test feeders, some random tree-structured networks, and the IEEE transmission system benchmarks. Simulation results show that the computation time can be improved dramatically with our algorithms over the centralized approach of solving the problem without decomposition, especially in tree-structured problems. The computation time grows linearly with the problem size with the cumulative approach while the distributed one can have size-independent computation time.
OCJul 5, 2012
Geometry of Injection Regions of Power NetworksBaosen Zhang, David Tse · stanford
We investigate the constraints on power flow in networks and its implications to the optimal power flow problem. The constraints are described by the injection region of a network; this is the set of all vectors of power injections, one at each bus, that can be achieved while satisfying the network and operation constraints. If there are no operation constraints, we show the injection region of a network is the set of all injections satisfying the conservation of energy. If the network has a tree topology, e.g., a distribution network, we show that under voltage magnitude, line loss constraints, line flow constraints and certain bus real and reactive power constraints, the injection region and its convex hull have the same Pareto-front. The Pareto-front is of interest since these are the the optimal solutions to the minimization of increasing functions over the injection region. For non-tree networks, we obtain a weaker result by characterize the convex hull of the voltage constraint injection region for lossless cycles and certain combinations of cycles and trees.
OCFeb 7, 2015
An Optimal and Distributed Method for Voltage Regulation in Power Distribution SystemsBaosen Zhang, Albert Y. S. Lam, Alejandro Dominguez-Garcia et al. · stanford
This paper addresses the problem of voltage regulation in power distribution networks with deep-penetration of distributed energy resources, e.g., renewable-based generation, and storage-capable loads such as plug-in hybrid electric vehicles. We cast the problem as an optimization program, where the objective is to minimize the losses in the network subject to constraints on bus voltage magnitudes, limits on active and reactive power injections, transmission line thermal limits and losses. We provide sufficient conditions under which the optimization problem can be solved via its convex relaxation. Using data from existing networks, we show that these sufficient conditions are expected to be satisfied by most networks. We also provide an efficient distributed algorithm to solve the problem. The algorithm adheres to a communication topology described by a graph that is the same as the graph that describes the electrical network topology. We illustrate the operation of the algorithm, including its robustness against communication link failures, through several case studies involving 5-, 34-, and 123-bus power distribution systems.
SYOct 20, 2011
Distributed Storage for Intermittent Energy Sources: Control Design and Performance LimitsYashodhan Kanoria, Andrea Montanari, David Tse et al. · stanford
One of the most important challenges in the integration of renewable energy sources into the power grid lies in their `intermittent' nature. The power output of sources like wind and solar varies with time and location due to factors that cannot be controlled by the provider. Two strategies have been proposed to hedge against this variability: 1) use energy storage systems to effectively average the produced power over time; 2) exploit distributed generation to effectively average production over location. We introduce a network model to study the optimal use of storage and transmission resources in the presence of random energy sources. We propose a Linear-Quadratic based methodology to design control strategies, and we show that these strategies are asymptotically optimal for some simple network topologies. For these topologies, the dependence of optimal performance on storage and transmission capacity is explicitly quantified.
OCApr 22, 2014
Network Risk Limiting Dispatch: Optimal Control and Price of UncertaintyBaosen Zhang, Ram Rajagopal, David Tse · stanford
Increased uncertainty due to high penetration of renewables imposes significant costs to the system operators. The added costs depend on several factors including market design, performance of renewable generation forecasting and the specific dispatch procedure. Quantifying these costs has been limited to small sample Monte Carlo approaches applied specific dispatch algorithms. The computational complexity and accuracy of these approaches has limited the understanding of tradeoffs between different factors. {In this work we consider a two-stage stochastic economic dispatch problem. Our goal is to provide an analytical quantification and an intuitive understanding of the effects of uncertainties and network congestion on the dispatch procedure and the optimal cost.} We first consider an uncongested network and calculate the risk limiting dispatch. In addition, we derive the price of uncertainty, a number that characterizes the intrinsic impact of uncertainty on the integration cost of renewables. Then we extend the results to a network where one link can become congested. Under mild conditions, we calculate price of uncertainty even in this case. We show that risk limiting dispatch is given by a set of deterministic equilibrium equations. The dispatch solution yields an important insight: congested links do not create isolated nodes, even in a two-node network. In fact, the network can support backflows in congested links, that are useful to reduce the uncertainty by averaging supply across the network. We demonstrate the performance of our approach in standard IEEE benchmark networks.
LGMar 18, 2022
Approximate Function Evaluation via Multi-Armed BanditsTavor Z. Baharav, Gary Cheng, Mert Pilanci et al. · stanford
We study the problem of estimating the value of a known smooth function $f$ at an unknown point $\boldsymbolμ \in \mathbb{R}^n$, where each component $μ_i$ can be sampled via a noisy oracle. Sampling more frequently components of $\boldsymbolμ$ corresponding to directions of the function with larger directional derivatives is more sample-efficient. However, as $\boldsymbolμ$ is unknown, the optimal sampling frequencies are also unknown. We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate, and with probability at least $1-δ$ returns an $ε$ accurate estimate of $f(\boldsymbolμ)$. We generalize our algorithm to adapt to heteroskedastic noise, and prove asymptotic optimality when $f$ is linear. We corroborate our theoretical results with numerical experiments, showing the dramatic gains afforded by adaptivity.
LGNov 1, 2022
Beyond the Best: Estimating Distribution Functionals in Infinite-Armed BanditsYifei Wang, Tavor Baharav, Yanjun Han et al. · stanford
In the infinite-armed bandit problem, each arm's average reward is sampled from an unknown distribution, and each arm can be sampled further to obtain noisy estimates of the average reward of that arm. Prior work focuses on identifying the best arm, i.e., estimating the maximum of the average reward distribution. We consider a general class of distribution functionals beyond the maximum, and propose unified meta algorithms for both the offline and online settings, achieving optimal sample complexities. We show that online estimation, where the learner can sequentially choose whether to sample a new or existing arm, offers no advantage over the offline setting for estimating the mean functional, but significantly reduces the sample complexity for other functionals such as the median, maximum, and trimmed mean. The matching lower bounds utilize several different Wasserstein distances. For the special case of median estimation, we identify a curious thresholding phenomenon on the indistinguishability between Gaussian convolutions with respect to the noise level, which may be of independent interest.
39.6CRMay 18
Bitcoin StakingXinshu Dong, Orfeas Stefanos Thyfronitis Litos, Ertem Nusret Tas et al.
The idea of security sharing goes back to Nakamoto's introduction of merge mining, a technique that enables Bitcoin miners to reuse their hash power to bootstrap and secure other Proof-of-Work (PoW) blockchains. However, with the rise of Proof-of-Stake (PoS) chains, there is a need for new methods of Bitcoin security sharing. We introduce Bitcoin staking, a protocol that allows Bitcoin holders to trustlessly use their idle asset to secure a PoS chain. The key challenge is to enable automatic slashing of bitcoins on the Bitcoin chain upon safety violations on the PoS chain. We achieve this using double-authentication-preventing signatures, finality gadgets and bi-directional timestamping between Bitcoin and the PoS chain. Our design is entirely modular and can be integrated with any PoS chain. A version of this protocol was deployed to secure the Babylon mainnet in April 2025 and currently has over 58,000 bitcoins staked (about 4 billion USD at current prices) while paying only 0.05% APR reward to the stakers. This is 2 orders of magnitude cheaper security cost than in PoS chains secured by their native token.
LGNov 19, 2018Code
Generalizable Adversarial Training via Spectral NormalizationFarzan Farnia, Jesse M. Zhang, David Tse
Deep neural networks (DNNs) have set benchmarks on a wide array of supervised learning tasks. Trained DNNs, however, often lack robustness to minor adversarial perturbations to the input, which undermines their true practicality. Recent works have increased the robustness of DNNs by fitting networks using adversarially-perturbed training samples, but the improved performance can still be far below the performance seen in non-adversarial settings. A significant portion of this gap can be attributed to the decrease in generalization performance due to adversarial training. In this work, we extend the notion of margin loss to adversarial settings and bound the generalization error for DNNs trained under several well-known gradient-based attack schemes, motivating an effective regularization scheme based on spectral normalization of the DNN's weight matrices. We also provide a computationally-efficient method for normalizing the spectral norm of convolutional layers with arbitrary stride and padding schemes in deep convolutional networks. We evaluate the power of spectral normalization extensively on combinations of datasets, network architectures, and adversarial training schemes. The code is available at https://github.com/jessemzhang/dl_spectral_normalization.
MLNov 2, 2017Code
Medoids in almost linear time via multi-armed banditsVivek Bagaria, Govinda M. Kamath, Vasilis Ntranos et al.
Computing the medoid of a large number of points in high-dimensional space is an increasingly common operation in many data science problems. We present an algorithm Med-dit which uses O(n log n) distance evaluations to compute the medoid with high probability. Med-dit is based on a connection with the multi-armed bandit problem. We evaluate the performance of Med-dit empirically on the Netflix-prize and the single-cell RNA-Seq datasets, containing hundreds of thousands of points living in tens of thousands of dimensions, and observe a 5-10x improvement in performance over the current state of the art. Med-dit is available at https://github.com/bagavi/Meddit
CRJan 20, 2022
Babylon: Reusing Bitcoin Mining to Enhance Proof-of-Stake SecurityErtem Nusret Tas, David Tse, Fisher Yu et al.
Bitcoin is the most secure blockchain in the world, supported by the immense hash power of its Proof-of-Work miners, but consumes huge amount of energy. Proof-of-Stake chains are energy-efficient, have fast finality and accountability, but face several fundamental security issues: susceptibility to non-slashable long-range safety attacks, non-slashable transaction censorship and stalling attacks and difficulty to bootstrap new PoS chains from low token valuation. We propose Babylon, a blockchain platform which combines the best of both worlds by reusing the immense Bitcoin hash power to enhance the security of PoS chains. Babylon provides a data-available timestamping service, securing PoS chains by allowing them to timestamp data-available block checkpoints, fraud proofs and censored transactions on Babylon. Babylon miners merge mine with Bitcoin and thus the platform has zero additional energy cost. The security of a Babylon-enhanced PoS protocol is formalized by a cryptoeconomic security theorem which shows slashable safety and liveness guarantees.
CRNov 24, 2021
Longest Chain Consensus Under Bandwidth ConstraintJoachim Neu, Srivatsan Sridhar, Lei Yang et al.
Spamming attacks are a serious concern for consensus protocols, as witnessed by recent outages of a major blockchain, Solana. They cause congestion and excessive message delays in a real network due to its bandwidth constraints. In contrast, longest chain (LC), an important family of consensus protocols, has previously only been proven secure assuming an idealized network model in which all messages are delivered within bounded delay. This model-reality mismatch is further aggravated for Proof-of-Stake (PoS) LC where the adversary can spam the network with equivocating blocks. Hence, we extend the network model to capture bandwidth constraints, under which nodes now need to choose carefully which blocks to spend their limited download budget on. To illustrate this point, we show that 'download along the longest header chain', a natural download rule for Proof-of-Work (PoW) LC, is insecure for PoS LC. We propose a simple rule 'download towards the freshest block', formalize two common heuristics 'not downloading equivocations' and 'blocklisting', and prove in a unified framework that PoS LC with any one of these download rules is secure in bandwidth-constrained networks. In experiments, we validate our claims and showcase the behavior of these download rules under attack. By composing multiple instances of a PoS LC protocol with a suitable download rule in parallel, we obtain a PoS consensus protocol that achieves a constant fraction of the network's throughput limit even under worst-case adversarial strategies.
CRNov 24, 2021
Information Dispersal with Provable Retrievability for RollupsKamilla Nazirkhanova, Joachim Neu, David Tse
The ability to verifiably retrieve transaction or state data stored off-chain is crucial to blockchain scaling techniques such as rollups or sharding. We formalize the problem and design a storage- and communication-efficient protocol using linear erasure-correcting codes and homomorphic vector commitments. Motivated by application requirements for rollups, our solution Semi-AVID-PR departs from earlier Verifiable Information Dispersal schemes in that we do not require comprehensive termination properties. Compared to Data Availability Oracles, under no circumstance do we fall back to returning empty blocks. Distributing a file of 22 MB among 256 storage nodes, up to 85 of which may be adversarial, requires in total ~70 MB of communication and storage, and ~41 seconds of single-thread runtime (<3 seconds on 16 threads) on an AMD Opteron 6378 processor when using the BLS12-381 curve. Our solution requires no modification to on-chain contracts of Validium rollups such as StarkWare's StarkEx. Additionally, it provides privacy of the dispersed data against honest-but-curious storage nodes. Finally, we discuss an application of our Semi-AVID-PR scheme to data availability verification schemes based on random sampling.
CROct 19, 2021
Three Attacks on Proof-of-Stake EthereumCaspar Schwarz-Schilling, Joachim Neu, Barnabé Monnot et al.
Recently, two attacks were presented against Proof-of-Stake (PoS) Ethereum: one where short-range reorganizations of the underlying consensus chain are used to increase individual validators' profits and delay consensus decisions, and one where adversarial network delay is leveraged to stall consensus decisions indefinitely. We provide refined variants of these attacks, considerably relaxing the requirements on adversarial stake and network timing, and thus rendering the attacks more severe. Combining techniques from both refined attacks, we obtain a third attack which allows an adversary with vanishingly small fraction of stake and no control over network message propagation (assuming instead probabilistic message propagation) to cause even long-range consensus chain reorganizations. Honest-but-rational or ideologically motivated validators could use this attack to increase their profits or stall the protocol, threatening incentive alignment and security of PoS Ethereum. The attack can also lead to destabilization of consensus from congestion in vote processing.
LGJun 18, 2021
Group-Structured Adversarial TrainingFarzan Farnia, Amirali Aghazadeh, James Zou et al.
Robust training methods against perturbations to the input data have received great attention in the machine learning literature. A standard approach in this direction is adversarial training which learns a model using adversarially-perturbed training samples. However, adversarial training performs suboptimally against perturbations structured across samples such as universal and group-sparse shifts that are commonly present in biological data such as gene expression levels of different tissues. In this work, we seek to close this optimality gap and introduce Group-Structured Adversarial Training (GSAT) which learns a model robust to perturbations structured across samples. We formulate GSAT as a non-convex concave minimax optimization problem which minimizes a group-structured optimal transport cost. Specifically, we focus on the applications of GSAT for group-sparse and rank-constrained perturbations modeled using group and nuclear norm penalties. In order to solve GSAT's non-smooth optimization problem in those cases, we propose a new minimax optimization algorithm called GDADMM by combining Gradient Descent Ascent (GDA) and Alternating Direction Method of Multipliers (ADMM). We present several applications of the GSAT framework to gain robustness against structured perturbations for image recognition and computational biology datasets.
CRMay 13, 2021
The Availability-Accountability Dilemma and its Resolution via Accountability GadgetsJoachim Neu, Ertem Nusret Tas, David Tse
For applications of Byzantine fault tolerant (BFT) consensus protocols where the participants are economic agents, recent works highlighted the importance of accountability: the ability to identify participants who provably violate the protocol. At the same time, being able to reach consensus under dynamic levels of participation is desirable for censorship resistance. We identify an availability-accountability dilemma: in an environment with dynamic participation, no protocol can simultaneously be accountably-safe and live. We provide a resolution to this dilemma by constructing a provably secure optimally-resilient accountability gadget to checkpoint a longest chain protocol, such that the full ledger is live under dynamic participation and the checkpointed prefix ledger is accountable. Our accountability gadget construction is black-box and can use any BFT protocol which is accountable under static participation. Using HotStuff as the black box, we implemented our construction as a protocol for the Ethereum 2.0 beacon chain, and our Internet-scale experiments with more than 4000 nodes show that the protocol achieves the required scalability and has better latency than the current solution Gasper, which was shown insecure by recent attacks.
CRNov 22, 2020
TaiJi: Longest Chain Availability with BFT Fast ConfirmationSongze Li, David Tse
Most state machine replication protocols are either based on the 40-years-old Byzantine Fault Tolerance (BFT) theory or the more recent Nakamoto's longest chain design. Longest chain protocols, designed originally in the Proof-of-Work (PoW) setting, are available under dynamic participation, but has probabilistic confirmation with long latency dependent on the security parameter. BFT protocols, designed for the permissioned setting, has fast deterministic confirmation, but assume a fixed number of nodes always online. We present a new construction which combines a longest chain protocol and a BFT protocol to get the best of both worlds. Using this construction, we design TaiJi, the first dynamically available PoW protocol which has almost deterministic confirmation with latency independent of the security parameter. In contrast to previous hybrid approaches which use a single longest chain to sample participants to run a BFT protocol, our native PoW construction uses many independent longest chains to sample propose actions and vote actions for the BFT protocol. This design enables TaiJi to inherit the full dynamic availability of Bitcoin, as well as its full unpredictability, making it secure against fully-adaptive adversaries with up to 50% of online hash power.
CROct 20, 2020
Snap-and-Chat Protocols: System AspectsJoachim Neu, Ertem Nusret Tas, David Tse
The availability-finality dilemma says that blockchain protocols cannot be both available under dynamic participation and safe under network partition. Snap-and-chat protocols have recently been proposed as a resolution to this dilemma. A snap-and-chat protocol produces an always available ledger containing a finalized prefix ledger which is always safe and catches up with the available ledger whenever network conditions permit. In contrast to existing handcrafted finality gadget based designs like Ethereum 2.0's consensus protocol Gasper, snap-and-chat protocols are constructed as a black-box composition of off-the-shelf BFT and longest chain protocols. In this paper, we consider system aspects of snap-and-chat protocols and show how they can provide two important features: 1) accountability, 2) support of light clients. Through this investigation, a deeper understanding of the strengths and challenges of snap-and-chat protocols is gained.
CROct 15, 2020
PoSAT: Proof-of-Work Availability and Unpredictability, without the WorkSoubhik Deb, Sreeram Kannan, David Tse
An important feature of Proof-of-Work (PoW) blockchains is full dynamic availability, allowing miners to go online and offline while requiring only 50% of the online miners to be honest. Existing Proof-of-stake (PoS), Proof-of-Space and related protocols are able to achieve this property only partially, either putting the additional assumption that adversary nodes to be online from the beginning and no new adversary nodes come online afterwards, or use additional trust assumptions for newly joining nodes.We propose a new PoS protocol PoSAT which can provably achieve dynamic availability fully without any additional assumptions. The protocol is based on the longest chain and uses a Verifiable Delay Function for the block proposal lottery to provide an arrow of time. The security analysis of the protocol draws on the recently proposed technique of Nakamoto blocks as well as the theory of branching random walks. An additional feature of PoSAT is the complete unpredictability of who will get to propose a block next, even by the winner itself. This unpredictability is at the same level of PoW protocols, and is stronger than that of existing PoS protocols using Verifiable Random Functions.
CRSep 10, 2020
Ebb-and-Flow Protocols: A Resolution of the Availability-Finality DilemmaJoachim Neu, Ertem Nusret Tas, David Tse
The CAP theorem says that no blockchain can be live under dynamic participation and safe under temporary network partitions. To resolve this availability-finality dilemma, we formulate a new class of flexible consensus protocols, ebb-and-flow protocols, which support a full dynamically available ledger in conjunction with a finalized prefix ledger. The finalized ledger falls behind the full ledger when the network partitions but catches up when the network heals. Gasper, the current candidate protocol for Ethereum 2.0's beacon chain, combines the finality gadget Casper FFG with the LMD GHOST fork choice rule and aims to achieve this property. However, we discovered an attack in the standard synchronous network model, highlighting a general difficulty with existing finality-gadget-based designs. We present a construction of provably secure ebb-and-flow protocols with optimal resilience. Nodes run an off-the-shelf dynamically available protocol, take snapshots of the growing available ledger, and input them into a separate off-the-shelf BFT protocol to finalize a prefix. We explore connections with flexible BFT and improve upon the state-of-the-art for that problem.
CRMay 21, 2020
Everything is a Race and Nakamoto Always WinsAmir Dembo, Sreeram Kannan, Ertem Nusret Tas et al.
Nakamoto invented the longest chain protocol, and claimed its security by analyzing the private double-spend attack, a race between the adversary and the honest nodes to grow a longer chain. But is it the worst attack? We answer the question in the affirmative for three classes of longest chain protocols, designed for different consensus models: 1) Nakamoto's original Proof-of-Work protocol; 2) Ouroboros and SnowWhite Proof-of-Stake protocols; 3) Chia Proof-of-Space protocol. As a consequence, exact characterization of the maximum tolerable adversary power is obtained for each protocol as a function of the average block time normalized by the network delay. The security analysis of these protocols is performed in a unified manner by a novel method of reducing all attacks to a race between the adversary and the honest nodes.
CRMay 19, 2020
Free2Shard: Adaptive-adversary-resistant sharding via Dynamic Self AllocationRanvir Rana, Sreeram Kannan, David Tse et al.
Propelled by the growth of large-scale blockchain deployments, much recent progress has been made in designing sharding protocols that achieve throughput scaling linearly in the number of nodes. However, existing protocols are not robust to an adversary adaptively corrupting a fixed fraction of nodes. In this paper, we propose Free2Shard -- a new architecture that achieves near-linear scaling while being secure against a fully adaptive adversary. The focal point of this architecture is a dynamic self-allocation algorithm that lets users allocate themselves to shards in response to adversarial action, without requiring a central or cryptographic proof. This architecture has several attractive features unusual for sharding protocols, including: (a) the ability to handle the regime of large number of shards (relative to the number of nodes); (b) heterogeneous shard demands; (c) requiring only a small minority to follow the self-allocation; (d) asynchronous shard rotation; (e) operation in a purely identity-free proof-of-work setting. The key technical contribution is a deep mathematical connection to the classical work of Blackwell in dynamic game theory.
CROct 5, 2019
Proof-of-Stake Longest Chain Protocols: Security vs PredictabilityVivek Bagaria, Amir Dembo, Sreeram Kannan et al.
The Nakamoto longest chain protocol is remarkably simple and has been proven to provide security against any adversary with less than 50% of the total hashing power. Proof-of-stake (PoS) protocols are an energy efficient alternative; however existing protocols adopting Nakamoto's longest chain design achieve provable security only by allowing long-term predictability (which have serious security implications). In this paper, we prove that a natural longest chain PoS protocol with similar predictability as Nakamoto's PoW protocol can achieve security against any adversary with less than 1/(1+e) fraction of the total stake. Moreover we propose a new family of longest chain PoS protocols that achieve security against a 50% adversary, while only requiring short-term predictability. Our proofs present a new approach to analyzing the formal security of blockchains, based on a notion of adversary-proof convergence.
CROct 4, 2019
Boomerang: Redundancy Improves Latency and Throughput in Payment-Channel NetworksVivek Bagaria, Joachim Neu, David Tse
In multi-path routing schemes for payment-channel networks, Alice transfers funds to Bob by splitting them into partial payments and routing them along multiple paths. Undisclosed channel balances and mismatched transaction fees cause delays and failures on some payment paths. For atomic transfer schemes, these straggling paths stall the whole transfer. We show that the latency of transfers reduces when redundant payment paths are added. This frees up liquidity in payment channels and hence increases the throughput of the network. We devise Boomerang, a generic technique to be used on top of multi-path routing schemes to construct redundant payment paths free of counterparty risk. In our experiments, applying Boomerang to a baseline routing scheme leads to 40% latency reduction and 2x throughput increase. We build on ideas from publicly verifiable secret sharing, such that Alice learns a secret of Bob iff Bob overdraws funds from the redundant paths. Funds are forwarded using Boomerang contracts, which allow Alice to revert the transfer iff she has learned Bob's secret. We implement the Boomerang contract in Bitcoin Script.
DCSep 25, 2019
Practical Low Latency Proof of Work ConsensusLei Yang, Xuechao Wang, Vivek Bagaria et al.
Bitcoin is the first fully-decentralized permissionless blockchain protocol to achieve a high level of security, but at the expense of poor throughput and latency. Scaling the performance of Bitcoin has a been a major recent direction of research. One successful direction of work has involved replacing proof of work (PoW) by proof of stake (PoS). Proposals to scale the performance in the PoW setting itself have focused mostly on parallelizing the mining process, scaling throughput; the few proposals to improve latency have either sacrificed throughput or the latency guarantees involve large constants rendering it practically useless. Our first contribution is to design a new PoW blockchain Prism++ that has provably low latency and high throughput; the design retains the parallel-chain approach espoused in Prism but invents a new confirmation rule to infer the permanency of a block by combining information across the parallel chains. We show security at the level of Bitcoin with very small confirmation latency (a small constant factor of block interarrival time). A key aspect to scaling the performance is to use a large number of parallel chains, which puts significant strain on the system. Our second contribution is the design and evaluation of a practical system to efficiently manage the memory, computation, and I/O imperatives of a large number of parallel chains. Our implementation of Prism++ achieves a throughput of over 80,000 transactions per second and confirmation latency of tens of seconds on networks of up to 900 EC2 Virtual Machines.
LGJan 27, 2019
Deconstructing Generative Adversarial NetworksBanghua Zhu, Jiantao Jiao, David Tse
We deconstruct the performance of GANs into three components: 1. Formulation: we propose a perturbation view of the population target of GANs. Building on this interpretation, we show that GANs can be viewed as a generalization of the robust statistics framework, and propose a novel GAN architecture, termed as Cascade GANs, to provably recover meaningful low-dimensional generator approximations when the real distribution is high-dimensional and corrupted by outliers. 2. Generalization: given a population target of GANs, we design a systematic principle, projection under admissible distance, to design GANs to meet the population requirement using finite samples. We implement our principle in three cases to achieve polynomial and sometimes near-optimal sample complexities: (1) learning an arbitrary generator under an arbitrary pseudonorm; (2) learning a Gaussian location family under TV distance, where we utilize our principle provide a new proof for the optimality of Tukey median viewed as GANs; (3) learning a low-dimensional Gaussian approximation of a high-dimensional arbitrary distribution under Wasserstein distance. We demonstrate a fundamental trade-off in the approximation error and statistical error in GANs, and show how to apply our principle with empirical samples to predict how many samples are sufficient for GANs in order not to suffer from the discriminator winning problem. 3. Optimization: we demonstrate alternating gradient descent is provably not locally asymptotically stable in optimizing the GAN formulation of PCA. We diagnose the problem as the minimax duality gap being non-zero, and propose a new GAN architecture whose duality gap is zero, where the value of the game is equal to the previous minimax value (not the maximin value). We prove the new GAN architecture is globally asymptotically stable in optimization under alternating gradient descent.
LGOct 28, 2018
A Convex Duality Framework for GANsFarzan Farnia, David Tse
Generative adversarial network (GAN) is a minimax game between a generator mimicking the true model and a discriminator distinguishing the samples produced by the generator from the real training samples. Given an unconstrained discriminator able to approximate any function, this game reduces to finding the generative model minimizing a divergence measure, e.g. the Jensen-Shannon (JS) divergence, to the data distribution. However, in practice the discriminator is constrained to be in a smaller class $\mathcal{F}$ such as neural nets. Then, a natural question is how the divergence minimization interpretation changes as we constrain $\mathcal{F}$. In this work, we address this question by developing a convex duality framework for analyzing GANs. For a convex set $\mathcal{F}$, this duality framework interprets the original GAN formulation as finding the generative model with minimum JS-divergence to the distributions penalized to match the moments of the data distribution, with the moments specified by the discriminators in $\mathcal{F}$. We show that this interpretation more generally holds for f-GAN and Wasserstein GAN. As a byproduct, we apply the duality framework to a hybrid of f-divergence and Wasserstein distance. Unlike the f-divergence, we prove that the proposed hybrid divergence changes continuously with the generative model, which suggests regularizing the discriminator's Lipschitz constant in f-GAN and vanilla GAN. We numerically evaluate the power of the suggested regularization schemes for improving GAN's training performance.
CROct 18, 2018
Deconstructing the Blockchain to Approach Physical LimitsVivek Bagaria, Sreeram Kannan, David Tse et al.
Transaction throughput, confirmation latency and confirmation reliability are fundamental performance measures of any blockchain system in addition to its security. In a decentralized setting, these measures are limited by two underlying physical network attributes: communication capacity and speed-of-light propagation delay. Existing systems operate far away from these physical limits. In this work we introduce Prism, a new proof-of-work blockchain protocol, which can achieve 1) security against up to 50% adversarial hashing power; 2) optimal throughput up to the capacity C of the network; 3) confirmation latency for honest transactions proportional to the propagation delay D, with confirmation error probability exponentially small in CD ; 4) eventual total ordering of all transactions. Our approach to the design of this protocol is based on deconstructing the blockchain into its basic functionalities and systematically scaling up these functionalities to approach their physical limits.
DMApr 15, 2018
Hidden Hamiltonian Cycle Recovery via Linear ProgrammingVivek Bagaria, Jian Ding, David Tse et al.
We introduce the problem of hidden Hamiltonian cycle recovery, where there is an unknown Hamiltonian cycle in an $n$-vertex complete graph that needs to be inferred from noisy edge measurements. The measurements are independent and distributed according to $\calP_n$ for edges in the cycle and $\calQ_n$ otherwise. This formulation is motivated by a problem in genome assembly, where the goal is to order a set of contigs (genome subsequences) according to their positions on the genome using long-range linking measurements between the contigs. Computing the maximum likelihood estimate in this model reduces to a Traveling Salesman Problem (TSP). Despite the NP-hardness of TSP, we show that a simple linear programming (LP) relaxation, namely the fractional $2$-factor (F2F) LP, recovers the hidden Hamiltonian cycle with high probability as $n \to \infty$ provided that $α_n - \log n \to \infty$, where $α_n \triangleq -2 \log \int \sqrt{d P_n d Q_n}$ is the Rényi divergence of order $\frac{1}{2}$. This condition is information-theoretically optimal in the sense that, under mild distributional assumptions, $α_n \geq (1+o(1)) \log n$ is necessary for any algorithm to succeed regardless of the computational cost. Departing from the usual proof techniques based on dual witness construction, the analysis relies on the combinatorial characterization (in particular, the half-integrality) of the extreme points of the F2F polytope. Represented as bicolored multi-graphs, these extreme points are further decomposed into simpler "blossom-type" structures for the large deviation analysis and counting arguments. Evaluation of the algorithm on real data shows improvements over existing approaches.
MENov 3, 2017
NeuralFDR: Learning Discovery Thresholds from Hypothesis FeaturesFei Xia, Martin J. Zhang, James Zou et al.
As datasets grow richer, an important challenge is to leverage the full features in the data to maximize the number of useful discoveries while controlling for false positives. We address this problem in the context of multiple hypotheses testing, where for each hypothesis, we observe a p-value along with a set of features specific to that hypothesis. For example, in genetic association studies, each hypothesis tests the correlation between a variant and the trait. We have a rich set of features for each variant (e.g. its location, conservation, epigenetics etc.) which could inform how likely the variant is to have a true association. However popular testing approaches, such as Benjamini-Hochberg's procedure (BH) and independent hypothesis weighting (IHW), either ignore these features or assume that the features are categorical or uni-variate. We propose a new algorithm, NeuralFDR, which automatically learns a discovery threshold as a function of all the hypothesis features. We parametrize the discovery threshold as a neural network, which enables flexible handling of multi-dimensional discrete and continuous features as well as efficient end-to-end optimization. We prove that NeuralFDR has strong false discovery rate (FDR) guarantees, and show that it makes substantially more discoveries in synthetic and real datasets. Moreover, we demonstrate that the learned discovery threshold is directly interpretable.
MLOct 30, 2017
Understanding GANs: the LQG SettingSoheil Feizi, Farzan Farnia, Tony Ginart et al.
Generative Adversarial Networks (GANs) have become a popular method to learn a probability model from data. In this paper, we aim to provide an understanding of some of the basic issues surrounding GANs including their formulation, generalization and stability on a simple benchmark where the data has a high-dimensional Gaussian distribution. Even in this simple benchmark, the GAN problem has not been well-understood as we observe that existing state-of-the-art GAN architectures may fail to learn a proper generative distribution owing to (1) stability issues (i.e., convergence to bad local solutions or not converging at all), (2) approximation issues (i.e., having improper global GAN optimizers caused by inappropriate GAN's loss functions), and (3) generalizability issues (i.e., requiring large number of samples for training). In this setup, we propose a GAN architecture which recovers the maximum-likelihood solution and demonstrates fast generalization. Moreover, we analyze global stability of different computational approaches for the proposed GAN optimization and highlight their pros and cons. Finally, we outline an extension of our model-based approach to design GANs in more complex setups than the considered Gaussian benchmark.
MLOct 5, 2017
Porcupine Neural Networks: (Almost) All Local Optima are GlobalSoheil Feizi, Hamid Javadi, Jesse Zhang et al.
Neural networks have been used prominently in several machine learning and statistics applications. In general, the underlying optimization of neural networks is non-convex which makes their performance analysis challenging. In this paper, we take a novel approach to this problem by asking whether one can constrain neural network weights to make its optimization landscape have good theoretical properties while at the same time, be a good approximation for the unconstrained one. For two-layer neural networks, we provide affirmative answers to these questions by introducing Porcupine Neural Networks (PNNs) whose weight vectors are constrained to lie over a finite set of lines. We show that most local optima of PNN optimizations are global while we have a characterization of regions where bad local optimizers may exist. Moreover, our theoretical and empirical results suggest that an unconstrained neural network can be approximated using a polynomially-large PNN.
LGApr 28, 2017
Time-Sensitive Bandit Learning and Satisficing Thompson SamplingDaniel Russo, David Tse, Benjamin Van Roy
The literature on bandit learning and regret analysis has focused on contexts where the goal is to converge on an optimal action in a manner that limits exploration costs. One shortcoming imposed by this orientation is that it does not treat time preference in a coherent manner. Time preference plays an important role when the optimal action is costly to learn relative to near-optimal actions. This limitation has not only restricted the relevance of theoretical results but has also influenced the design of algorithms. Indeed, popular approaches such as Thompson sampling and UCB can fare poorly in such situations. In this paper, we consider discounted rather than cumulative regret, where a discount factor encodes time preference. We propose satisficing Thompson sampling -- a variation of Thompson sampling -- and establish a strong discounted regret bound for this new algorithm.
MLFeb 17, 2017
Maximally Correlated Principal Component AnalysisSoheil Feizi, David Tse
In the era of big data, reducing data dimensionality is critical in many areas of science. Widely used Principal Component Analysis (PCA) addresses this problem by computing a low dimensional data embedding that maximally explain variance of the data. However, PCA has two major weaknesses. Firstly, it only considers linear correlations among variables (features), and secondly it is not suitable for categorical data. We resolve these issues by proposing Maximally Correlated Principal Component Analysis (MCPCA). MCPCA computes transformations of variables whose covariance matrix has the largest Ky Fan norm. Variable transformations are unknown, can be nonlinear and are computed in an optimization. MCPCA can also be viewed as a multivariate extension of Maximal Correlation. For jointly Gaussian variables we show that the covariance matrix corresponding to the identity (or the negative of the identity) transformations majorizes covariance matrices of non-identity functions. Using this result we characterize global MCPCA optimizers for nonlinear functions of jointly Gaussian variables for every rank constraint. For categorical variables we characterize global MCPCA optimizers for the rank one constraint based on the leading eigenvector of a matrix computed using pairwise joint distributions. For a general rank constraint we propose a block coordinate descend algorithm and show its convergence to stationary points of the MCPCA optimization. We compare MCPCA with PCA and other state-of-the-art dimensionality reduction methods including Isomap, LLE, multilayer autoencoders (neural networks), kernel PCA, probabilistic PCA and diffusion maps on several synthetic and real datasets. We show that MCPCA consistently provides improved performance compared to other methods.
MLJun 7, 2016
A Minimax Approach to Supervised LearningFarzan Farnia, David Tse
Given a task of predicting $Y$ from $X$, a loss function $L$, and a set of probability distributions $Γ$ on $(X,Y)$, what is the optimal decision rule minimizing the worst-case expected loss over $Γ$? In this paper, we address this question by introducing a generalization of the principle of maximum entropy. Applying this principle to sets of distributions with marginal on $X$ constrained to be the empirical marginal from the data, we develop a general minimax approach for supervised learning problems. While for some loss functions such as squared-error and log loss, the minimax approach rederives well-knwon regression models, for the 0-1 loss it results in a new linear classifier which we call the maximum entropy machine. The maximum entropy machine minimizes the worst-case 0-1 loss over the structured set of distribution, and by our numerical experiments can outperform other well-known linear classifiers such as SVM. We also prove a bound on the generalization worst-case error in the minimax approach.
ITFeb 11, 2016
Community Recovery in Graphs with LocalityYuxin Chen, Govinda Kamath, Changho Suh et al.
Motivated by applications in domains such as social networks and computational biology, we study the problem of community recovery in graphs with locality. In this problem, pairwise noisy measurements of whether two nodes are in the same community or different communities come mainly or exclusively from nearby nodes rather than uniformly sampled between all nodes pairs, as in most existing models. We present an algorithm that runs nearly linearly in the number of measurements and which achieves the information theoretic limit for exact recovery.
LGNov 5, 2015
Discrete Rényi ClassifiersMeisam Razaviyayn, Farzan Farnia, David Tse
Consider the binary classification problem of predicting a target variable $Y$ from a discrete feature vector $X = (X_1,...,X_d)$. When the probability distribution $\mathbb{P}(X,Y)$ is known, the optimal classifier, leading to the minimum misclassification rate, is given by the Maximum A-posteriori Probability decision rule. However, estimating the complete joint distribution $\mathbb{P}(X,Y)$ is computationally and statistically impossible for large values of $d$. An alternative approach is to first estimate some low order marginals of $\mathbb{P}(X,Y)$ and then design the classifier based on the estimated low order marginals. This approach is also helpful when the complete training data instances are not available due to privacy concerns. In this work, we consider the problem of finding the optimum classifier based on some estimated low order marginals of $(X,Y)$. We prove that for a given set of marginals, the minimum Hirschfeld-Gebelein-Renyi (HGR) correlation principle introduced in [1] leads to a randomized classification rule which is shown to have a misclassification rate no larger than twice the misclassification rate of the optimal classifier. Then, under a separability condition, we show that the proposed algorithm is equivalent to a randomized linear regression approach. In addition, this method naturally results in a robust feature selection method selecting a subset of features having the maximum worst case HGR correlation with the target variable. Our theoretical upper-bound is similar to the recent Discrete Chebyshev Classifier (DCC) approach [2], while the proposed algorithm has significant computational advantages since it only requires solving a least square optimization problem. Finally, we numerically compare our proposed algorithm with the DCC classifier and show that the proposed algorithm results in better misclassification rate over various datasets.