OCMay 31, 2018Code
Accelerating Incremental Gradient Optimization with Curvature InformationHoi-To Wai, Wei Shi, Cesar A. Uribe et al.
This paper studies an acceleration technique for incremental aggregated gradient ({\sf IAG}) method through the use of \emph{curvature} information for solving strongly convex finite sum optimization problems. These optimization problems of interest arise in large-scale learning applications. Our technique utilizes a curvature-aided gradient tracking step to produce accurate gradient estimates incrementally using Hessian information. We propose and analyze two methods utilizing the new technique, the curvature-aided IAG ({\sf CIAG}) method and the accelerated CIAG ({\sf A-CIAG}) method, which are analogous to gradient method and Nesterov's accelerated gradient method, respectively. Setting $κ$ to be the condition number of the objective function, we prove the $R$ linear convergence rates of $1 - \frac{4c_0 κ}{(κ+1)^2}$ for the {\sf CIAG} method, and $1 - \sqrt{\frac{c_1}{2κ}}$ for the {\sf A-CIAG} method, where $c_0,c_1 \leq 1$ are constants inversely proportional to the distance between the initial point and the optimal solution. When the initial iterate is close to the optimal solution, the $R$ linear convergence rates match with the gradient and accelerated gradient method, albeit {\sf CIAG} and {\sf A-CIAG} operate in an incremental setting with strictly lower computation complexity. Numerical experiments confirm our findings. The source codes used for this paper can be found on \url{http://github.com/hoitowai/ciag/}.
OCJun 28, 2021
Robust Distributed Optimization With Randomly Corrupted GradientsBerkay Turan, Cesar A. Uribe, Hoi-To Wai et al.
In this paper, we propose a first-order distributed optimization algorithm that is provably robust to Byzantine failures-arbitrary and potentially adversarial behavior, where all the participating agents are prone to failure. We model each agent's state over time as a two-state Markov chain that indicates Byzantine or trustworthy behaviors at different time instants. We set no restrictions on the maximum number of Byzantine agents at any given time. We design our method based on three layers of defense: 1) temporal robust aggregation, 2) spatial robust aggregation, and 3) gradient normalization. We study two settings for stochastic optimization, namely Sample Average Approximation and Stochastic Approximation. We provide convergence guarantees of our method for strongly convex and smooth non-convex cost functions.
AINov 20, 2020
A General Framework for Distributed Inference with Uncertain ModelsJames Z. Hare, Cesar A. Uribe, Lance Kaplan et al.
This paper studies the problem of distributed classification with a network of heterogeneous agents. The agents seek to jointly identify the underlying target class that best describes a sequence of observations. The problem is first abstracted to a hypothesis-testing framework, where we assume that the agents seek to agree on the hypothesis (target class) that best matches the distribution of observations. Non-Bayesian social learning theory provides a framework that solves this problem in an efficient manner by allowing the agents to sequentially communicate and update their beliefs for each hypothesis over the network. Most existing approaches assume that agents have access to exact statistical models for each hypothesis. However, in many practical applications, agents learn the likelihood models based on limited data, which induces uncertainty in the likelihood function parameters. In this work, we build upon the concept of uncertain models to incorporate the agents' uncertainty in the likelihoods by identifying a broad set of parametric distribution that allows the agents' beliefs to converge to the same result as a centralized approach. Furthermore, we empirically explore extensions to non-parametric models to provide a generalized framework of uncertain models in non-Bayesian social learning.
AISep 9, 2019
Non-Bayesian Social Learning with Uncertain ModelsJames Z. Hare, Cesar A. Uribe, Lance Kaplan et al.
Non-Bayesian social learning theory provides a framework that models distributed inference for a group of agents interacting over a social network. In this framework, each agent iteratively forms and communicates beliefs about an unknown state of the world with their neighbors using a learning rule. Existing approaches assume agents have access to precise statistical models (in the form of likelihoods) for the state of the world. However in many situations, such models must be learned from finite data. We propose a social learning rule that takes into account uncertainty in the statistical models using second-order probabilities. Therefore, beliefs derived from uncertain models are sensitive to the amount of past evidence collected for each hypothesis. We characterize how well the hypotheses can be tested on a social network, as consistent or not with the state of the world. We explicitly show the dependency of the generated beliefs with respect to the amount of prior evidence. Moreover, as the amount of prior evidence goes to infinity, learning occurs and is consistent with traditional social learning theory.