SY Sep 29, 2011
Distributed Algorithms for Consensus and Coordination in the Presence of Packet-Dropping Communication Links - Part II: Coefficients of Ergodicity Analysis Approach
Nitin H. Vaidya, Christoforos N. Hadjicostis, Alejandro D. Dominguez-Garcia
In this two-part paper, we consider multicomponent systems in which each component can iteratively exchange information with other components in its neighborhood in order to compute, in a distributed fashion, the average of the components' initial values or some other quantity of interest (i.e., some function of these initial values). In particular, we study an iterative algorithm for computing the average of the initial values of the nodes. In this algorithm, each component maintains two sets of variables that are updated via two identical linear iterations. The average of the initial values of the nodes can be asymptotically computed by each node as the ratio of two of the variables it maintains. In the first part of this paper, we show how the update rules for the two sets of variables can be enhanced so that the algorithm becomes tolerant to communication links that may drop packets, independently among them and independently between different transmission times. In this second part, by rewriting the collective dynamics of both iterations, we show that the resulting system is mathematically equivalent to a finite inhomogeneous Markov chain whose transition matrix takes one of finitely many values at each step. Then, by using a coefficients of ergodicity approach, a method commonly used for convergence analysis of Markov chains, we prove convergence of the robustified consensus scheme. The analysis suggests that similar convergence should hold under more general conditions as well.
SY Sep 29, 2011
Distributed Algorithms for Consensus and Coordination in the Presence of Packet-Dropping Communication Links - Part I: Statistical Moments Analysis Approach
Alejandro D. Dominguez-Garcia, Christoforos N. Hadjicostis, Nitin H. Vaidya
This two-part paper discusses robustification methodologies for linear-iterative distributed algorithms for consensus and coordination problems in multicomponent systems, in which unreliable communication links may drop packets. We consider a setup where communication links between components can be asymmetric (i.e., component j might be able to send information to component i, but not necessarily vice-versa), so that the information exchange between components in the system is in general described by a directed graph that is assumed to be strongly connected. In the absence of communication link failures, each component i maintains two auxiliary variables and updates each of their values to be a linear combination of their corresponding previous values and the corresponding previous values of neighboring components (i.e., components that send information to node i). By appropriately initializing these two (decoupled) iterations, the system components can asymptotically calculate variables of interest in a distributed fashion; in particular, the average of the initial conditions can be calculated as a function that involves the ratio of these two auxiliary variables. The focus of this paper is to robustify this double-iteration algorithm against communication link failures. We achieve this by modifying the double-iteration algorithm (by introducing some additional auxiliary variables) and prove that the modified double-iteration converges almost surely to average consensus. In the first part of the paper, we study the first and second moments of the two iterations, use them to establish convergence, and illustrate the performance of the algorithm with several numerical examples. In the second part, in order to establish the convergence of the algorithm, we use coefficients of ergodicity commonly used in analyzing inhomogeneous Markov chains.
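The failure-free double iteration described above can be illustrated with a minimal sketch. It assumes each node splits its two variables equally among its out-neighbors and itself (one common column-stochastic weight choice, not necessarily the paper's), and the names `ratio_consensus` and `out_neighbors` and the example graph are hypothetical. It does not model packet drops or the additional auxiliary variables used for robustification.

```python
def ratio_consensus(values, out_neighbors, steps=200):
    """Run two identical linear iterations (y and z) on a directed graph.

    y starts at the initial values, z starts at all ones; each node's
    ratio y[i] / z[i] approaches the average of the initial values when
    the graph is strongly connected.
    """
    n = len(values)
    y = [float(v) for v in values]
    z = [1.0] * n
    for _ in range(steps):
        new_y = [0.0] * n
        new_z = [0.0] * n
        for j in range(n):
            # Node j sends an equal share to each out-neighbor and keeps one.
            targets = out_neighbors[j] + [j]
            share = 1.0 / len(targets)
            for i in targets:
                new_y[i] += share * y[j]
                new_z[i] += share * z[j]
        y, z = new_y, new_z
    return [y[i] / z[i] for i in range(n)]


# Hypothetical strongly connected directed graph: 0->1->2->0 and 2->3->0.
graph = {0: [1], 1: [2], 2: [0, 3], 3: [0]}
initial = [4.0, 8.0, 0.0, 2.0]
estimates = ratio_consensus(initial, graph)
average = sum(initial) / len(initial)
```

Because the weights in each column sum to one, the totals of `y` and `z` are preserved at every step, which is why the ratio recovers the average even though the links are asymmetric.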
DC Nov 16, 2022
Impact of Redundancy on Resilience in Distributed Optimization and Learning
Shuo Liu, Nirupam Gupta, Nitin H. Vaidya
This report considers the problem of resilient distributed optimization and stochastic learning in a server-based architecture. The system comprises a server and multiple agents, where each agent has its own local cost function. The agents collaborate with the server to find a minimum of the aggregate of the local cost functions. In the context of stochastic learning, the local cost of an agent is the loss function computed over the data at that agent. In this report, we consider this problem in a system wherein some of the agents may be Byzantine faulty and some of the agents may be slow (also called stragglers). In this setting, we investigate the conditions under which it is possible to obtain an "approximate" solution to the above problem. In particular, we introduce the notion of $(f, r; ε)$-resilience to characterize how well the true solution is approximated in the presence of up to $f$ Byzantine faulty agents, and up to $r$ slow agents (or stragglers) -- smaller $ε$ represents a better approximation. We also introduce a measure named $(f, r; ε)$-redundancy to characterize the redundancy in the cost functions of the agents. Greater redundancy allows for a better approximation when solving the problem of aggregate cost minimization. In this report, we constructively show (both theoretically and empirically) that $(f, r; \mathcal{O}(ε))$-resilience can indeed be achieved in practice, given that the local cost functions are sufficiently redundant.
DC May 11
Byzantine Consensus in Directed Graphs with Message Authentication
Nitin H. Vaidya, Lewis Tseng
We consider the problem of reaching consensus in communication networks that are modeled by directed graphs. We assume the existence of a message authentication mechanism (such as digital signatures) to verify the integrity of messages. We identify the necessary and sufficient conditions on the directed communication graph for the following problems to be solvable: (i) exact consensus in synchronous systems; and (ii) approximate consensus in asynchronous systems.
LG Aug 11, 2020
Byzantine Fault-Tolerant Distributed Machine Learning Using Stochastic Gradient Descent (SGD) and Norm-Based Comparative Gradient Elimination (CGE)
Nirupam Gupta, Shuo Liu, Nitin H. Vaidya
This paper considers the Byzantine fault-tolerance problem in the distributed stochastic gradient descent (D-SGD) method, a popular algorithm for distributed multi-agent machine learning. In this problem, each agent samples data points independently from a certain data-generating distribution. In the fault-free case, the D-SGD method allows all the agents to learn a mathematical model best fitting the data collectively sampled by all agents. We consider the case when a fraction of agents may be Byzantine faulty. Such faulty agents may not follow a prescribed algorithm correctly, and may render the traditional D-SGD method ineffective by sharing arbitrary incorrect stochastic gradients. We propose a norm-based gradient-filter, named comparative gradient elimination (CGE), that robustifies the D-SGD method against Byzantine agents. We show that the CGE gradient-filter guarantees fault-tolerance against a bounded fraction of Byzantine agents under standard stochastic assumptions, and is computationally simpler compared to many existing gradient-filters such as multi-KRUM, geometric median-of-means, and the spectral filters. We empirically show, by simulating distributed learning on neural networks, that the fault-tolerance of CGE is comparable to that of existing gradient-filters. We also empirically show that exponential averaging of stochastic gradients improves the fault-tolerance of a generic gradient-filter.
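A norm-based gradient filter of the kind described above might look like the following sketch. The abstract does not spell out the rule, so this assumes one plausible reading: sort the received gradients by Euclidean norm, discard the `f` with the largest norms, and average the rest. The function name `cge_filter`, the averaging step, and the sample gradients are assumptions for illustration.

```python
import math


def cge_filter(gradients, f):
    """Drop the f gradients with the largest Euclidean norms, average the rest.

    Assumed reading of a norm-based filter; the paper's exact aggregation
    (for example, sum versus average) may differ.
    """
    n = len(gradients)
    if not 0 <= f < n:
        raise ValueError("f must satisfy 0 <= f < number of gradients")
    by_norm = sorted(gradients, key=lambda g: math.sqrt(sum(c * c for c in g)))
    kept = by_norm[: n - f]
    dim = len(kept[0])
    return [sum(g[d] for g in kept) / len(kept) for d in range(dim)]


# Three honest gradients and one hypothetical Byzantine gradient with a huge norm.
honest = [[1.0, 0.0], [0.9, 0.1], [1.1, -0.1]]
byzantine = [[100.0, -100.0]]
aggregate = cge_filter(honest + byzantine, f=1)
```

The cost is dominated by one sort over the gradient norms, which is consistent with the abstract's claim of computational simplicity. A Byzantine gradient with a small norm would pass the filter, so the sketch only illustrates the elimination step, not the fault-tolerance guarantee.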
OC Apr 9, 2020
A Private and Finite-Time Algorithm for Solving a Distributed System of Linear Equations
Shripad Gade, Ji Liu, Nitin H. Vaidya
This paper studies a system of linear equations, denoted as $Ax = b$, which is horizontally partitioned (rows in $A$ and $b$) and stored over a network of $m$ devices connected in a fixed directed graph. We design a fast distributed algorithm for solving such a partitioned system of linear equations that additionally protects the privacy of local data against an honest-but-curious adversary that corrupts at most $τ$ nodes in the network. First, we present TITAN, privaTe fInite Time Average coNsensus algorithm, for solving a general average consensus problem over directed graphs, while protecting statistical privacy of private local data against an honest-but-curious adversary. Second, we propose a distributed linear system solver that involves each agent/device computing an update based on local private data, followed by private aggregation using TITAN. Finally, we show convergence of our solver to the least squares solution in finite rounds along with statistical privacy of local linear equations against an honest-but-curious adversary provided the graph has weak vertex-connectivity of at least $τ+1$. We perform numerical experiments to validate our claims and compare our solution to the state-of-the-art methods by comparing computation, communication and memory costs.
CR Apr 3, 2020
Preserving Statistical Privacy in Distributed Optimization
Nirupam Gupta, Shripad Gade, Nikhil Chopra et al.
We present a distributed optimization protocol that preserves statistical privacy of agents' local cost functions against a passive adversary that corrupts some agents in the network. The protocol is a composition of a distributed "zero-sum" obfuscation protocol that obfuscates the agents' local cost functions, and a standard non-private distributed optimization method. We show that our protocol protects the statistical privacy of the agents' local cost functions against a passive adversary that corrupts up to $t$ arbitrary agents as long as the communication network has $(t+1)$-vertex connectivity. The "zero-sum" obfuscation protocol preserves the sum of the agents' local cost functions and therefore ensures accuracy of the computed solution.
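The sum-preserving property of a zero-sum obfuscation step can be sketched as follows. This is a simplified illustration, not the protocol from the paper: each local cost is stood in for by a single integer coefficient, every pair of neighbors agrees on one random mask that one adds and the other subtracts, and the names `zero_sum_masks`, `edges`, and `scale` are hypothetical.

```python
import random


def zero_sum_masks(edges, n, rng, scale=1000):
    """Build per-agent masks that cancel out over the whole network.

    For each edge (i, j) a shared random integer r is added to agent i's
    mask and subtracted from agent j's mask, so all masks sum to zero.
    """
    masks = [0] * n
    for i, j in edges:
        r = rng.randint(-scale, scale)
        masks[i] += r
        masks[j] -= r
    return masks


rng = random.Random(0)
local_costs = [5, -2, 7, 1]  # stand-ins for the agents' private cost data
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (1, 3)]  # hypothetical network
masks = zero_sum_masks(edges, len(local_costs), rng)
obfuscated = [c + m for c, m in zip(local_costs, masks)]
```

Each agent then runs the non-private optimization method on its obfuscated value. Since the masks cancel, the aggregate is unchanged, which is the reason the abstract gives for the accuracy of the computed solution.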
CR Feb 26, 2020
Improved Extension Protocols for Byzantine Broadcast and Agreement
Kartik Nayak, Ling Ren, Elaine Shi et al.
Byzantine broadcast (BB) and Byzantine agreement (BA) are two of the most fundamental problems and essential building blocks in distributed computing, and improving their efficiency is of interest to both theoreticians and practitioners. In this paper, we study extension protocols of BB and BA, i.e., protocols that solve BB/BA with long inputs of $l$ bits using lower costs than $l$ single-bit instances. We present new protocols with improved communication complexity in almost all settings: authenticated BA/BB with $t<n/2$, authenticated BB with $t<(1-ε)n$, unauthenticated BA/BB with $t<n/3$, and asynchronous reliable broadcast and BA with $t<n/3$. The new protocols are advantageous and significant in several aspects. First, they achieve the best-possible communication complexity of $Θ(nl)$ for wider ranges of input sizes compared to prior results. Second, the authenticated extension protocols achieve optimal communication complexity given the current best available BB/BA protocols for short messages. Third, to the best of our knowledge, our asynchronous and authenticated protocols in the setting are the first extension protocols in that setting.
DC Dec 19, 2019
Randomized Reactive Redundancy for Byzantine Fault-Tolerance in Parallelized Learning
Nirupam Gupta, Nitin H. Vaidya
This report considers the problem of Byzantine fault-tolerance in synchronous parallelized learning that is founded on the parallelized stochastic gradient descent (parallelized-SGD) algorithm. The system comprises a master, and $n$ workers, where up to $f$ of the workers are Byzantine faulty. Byzantine workers need not follow the master's instructions correctly, and might send malicious incorrect (or faulty) information. The identity of the Byzantine workers remains fixed throughout the learning process, and is unknown a priori to the master. We propose two coding schemes, a deterministic scheme and a randomized scheme, for guaranteeing exact fault-tolerance if $2f < n$. The coding schemes use the concept of reactive redundancy for isolating Byzantine workers that eventually send faulty information. We note that the computation efficiencies of the schemes compare favorably with other (deterministic or randomized) coding schemes, for exact fault-tolerance.
LG Mar 20, 2019
Byzantine Fault Tolerant Distributed Linear Regression
Nirupam Gupta, Nitin H. Vaidya
This paper considers the problem of Byzantine fault tolerance in distributed linear regression in a multi-agent system. However, the proposed algorithms are given for a more general class of distributed optimization problems, of which distributed linear regression is a special case. The system comprises a server and multiple agents, where each agent holds a certain number of data points and responses that satisfy a (possibly noisy) linear relationship. The objective of the server is to determine this relationship, given that some of the agents in the system (up to a known number) are Byzantine faulty (i.e., actively adversarial). We show that the server can achieve this objective, in a deterministic manner, by robustifying the original distributed gradient descent method using norm-based filters, namely 'norm filtering' and 'norm-cap filtering', incurring an additional log-linear computation cost in each iteration. The proposed algorithms improve upon the existing methods on three levels: i) no assumptions are required on the probability distribution of data points, ii) the system can be partially asynchronous, and iii) the computational overhead (in order to handle Byzantine faulty agents) is log-linear in the number of agents and linear in the dimension of data points. The proposed algorithms differ from each other in the assumptions made for their correctness, and the gradient filter they use.
DC Mar 27, 2017
Private Learning on Networks: Part II
Shripad Gade, Nitin H. Vaidya
This paper considers a distributed multi-agent optimization problem, with the global objective consisting of the sum of local objective functions of the agents. The agents solve the optimization problem using local computation and communication between adjacent agents in the network. We present two randomized iterative algorithms for distributed optimization. To improve privacy, our algorithms add "structured" randomization to the information exchanged between the agents. We prove deterministic correctness (in every execution) of the proposed algorithms despite the information being perturbed by noise with non-zero mean. We prove that a special case of a proposed algorithm (called function sharing) preserves privacy of individual polynomial objective functions under a suitable connectivity condition on the network topology.
DC Dec 15, 2016
Private Learning on Networks
Shripad Gade, Nitin H. Vaidya
Continual data collection and widespread deployment of machine learning algorithms, particularly the distributed variants, have raised new privacy challenges. In a distributed machine learning scenario, the dataset is stored among several machines and they solve a distributed optimization problem to collectively learn the underlying model. We present a secure multi-party computation inspired privacy preserving distributed algorithm for optimizing a convex function consisting of several possibly non-convex functions. Each individual objective function is privately stored with an agent while the agents communicate model parameters with neighbor machines connected in a network. We show that our algorithm can correctly optimize the overall objective function and learn the underlying model accurately. We further prove that under a vertex connectivity condition on the topology, our algorithm preserves privacy of individual objective functions. We establish limits on what a coalition of adversaries can learn by observing the messages and states shared over a network.
DC Aug 18, 2016
Distributed Optimization of Convex Sum of Non-Convex Functions
Shripad Gade, Nitin H. Vaidya
We present a distributed solution to optimizing a convex function composed of several non-convex functions. Each non-convex function is privately stored with an agent while the agents communicate with neighbors to form a network. We show that the coupled consensus and projected gradient descent algorithm proposed in [1] can optimize a convex sum of non-convex functions under an additional assumption on gradient Lipschitzness. We further discuss the applications of this analysis in improving privacy in distributed optimization.
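A toy instance of a convex sum of non-convex functions can make the setting concrete. The sketch below uses a generic consensus-plus-projected-gradient iteration, which is an assumption and not necessarily the exact algorithm of [1]. The two local functions, the step-size schedule, the interval used for projection, and the weight matrix are all hypothetical choices.

```python
import math

# Local costs f1(x) = x^2 + 2cos(3x) and f2(x) = x^2 - 2cos(3x) are each
# non-convex (f1'' = 2 - 18cos(3x) is negative near x = 0), yet their sum
# 2x^2 is convex with minimizer x = 0. Both gradients are Lipschitz.


def grad_f1(x):
    return 2 * x - 6 * math.sin(3 * x)


def grad_f2(x):
    return 2 * x + 6 * math.sin(3 * x)


def project(x, lo=-5.0, hi=5.0):
    """Project onto the (assumed) constraint interval [lo, hi]."""
    return max(lo, min(hi, x))


def consensus_projected_gradient(x_init, grads, weights, steps=200):
    x = list(x_init)
    n = len(x)
    for k in range(steps):
        step = 0.2 / math.sqrt(k + 1)  # diminishing step size (assumed schedule)
        # Consensus step: mix neighbors' estimates with doubly stochastic weights.
        mixed = [sum(weights[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Local projected gradient step using only the agent's own function.
        x = [project(mixed[i] - step * grads[i](mixed[i])) for i in range(n)]
    return x


final = consensus_projected_gradient(
    [3.0, -1.0], [grad_f1, grad_f2], [[0.5, 0.5], [0.5, 0.5]]
)
```

In this example the non-convex parts of the two local gradients cancel in the consensus average, so the averaged estimate follows gradient descent on the convex sum and both agents approach the minimizer at zero.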
DC Aug 12, 2016
Distributed Optimization for Client-Server Architecture with Negative Gradient Weights
Shripad Gade, Nitin H. Vaidya
Availability of both massive datasets and computing resources has made machine learning and predictive analytics extremely pervasive. In this work we present a synchronous algorithm and architecture for distributed optimization motivated by privacy requirements posed by applications in machine learning. We present an algorithm for the recently proposed multi-parameter-server architecture. We consider a group of parameter servers that learn a model based on randomized gradients received from clients. Clients are computational entities with private datasets (inducing a private objective function) that evaluate and upload randomized gradients to the parameter servers. The parameter servers perform model updates based on received gradients and share the model parameters with other servers. We prove that the proposed algorithm can optimize the overall objective function for a very general architecture involving $C$ clients connected to $S$ parameter servers in an arbitrary time-varying topology, with the parameter servers forming a connected network.
DC Jun 28, 2016
Defending Non-Bayesian Learning against Adversarial Attacks
Lili Su, Nitin H. Vaidya
This paper addresses the problem of non-Bayesian learning over multi-agent networks, where agents repeatedly collect partially informative observations about an unknown state of the world, and try to collaboratively learn the true state. We focus on the impact of the adversarial agents on the performance of consensus-based non-Bayesian learning, where non-faulty agents combine local learning updates with consensus primitives. In particular, we consider the scenario where an unknown subset of agents suffer Byzantine faults -- agents suffering Byzantine faults behave arbitrarily. Two different learning rules are proposed.