Usman A. Khan

h-index34

22papers

754citations

Novelty44%

AI Score36

Ranked #99,298 of 194,257 authors (top 51%)#650 in SY (top 40%)

22 Papers

1.2SYFeb 12, 2018

Localization in internets of mobile agents: A linear approach

Sam Safavi, Usman A. Khan, Soummya Kar et al. · cmu

Fifth generation~(5G) networks providing much higher bandwidth and faster data rates will allow connecting vast number of static and mobile devices, sensors, agents, users, machines, and vehicles, supporting Internet-of-Things (IoT), real-time dynamic networks of mobile things. Positioning and location awareness will become increasingly important, enabling deployment of new services and contributing to significantly improving the overall performance of the 5G~system. Many of the currently talked about solutions to positioning in~5G are centralized, mostly requiring direct line-of-sight (LoS) to deployed access nodes or anchors at the same time, which in turn requires high-density deployments of anchors. But these LoS and centralized positioning solutions may become unwieldy as the number of users and devices continues to grow without limit in sight. As an alternative to the centralized solutions, this paper discusses distributed localization in a 5G enabled IoT environment where many low power devices, users, or agents are to locate themselves without global or LoS access to anchors. Even though positioning is essentially a non-linear problem (solving circle equations by trilateration or triangulation), we discuss a cooperative \textit{linear} distributed iterative solution with only local measurements, communication and computation needed at each agent. Linearity is obtained by reparametrization of the agent location through barycentric coordinate representations based on local neighborhood geometry that may be computed in terms of certain Cayley-Menger determinants involving relative local inter-agent distance measurements.

1.2SYJan 21, 2017

Localization in mobile networks via virtual convex hulls

Sam Safavi, Usman A. Khan

In this paper, we develop a \textit{distributed} algorithm to localize an arbitrary number of agents moving in a bounded region of interest. We assume that the network contains \textit{at least one} agent with known location (hereinafter referred to as an anchor), and each agent measures a noisy version of its motion and the distances to the nearby agents. We provide a~\emph{geometric approach}, which allows each agent to: (i) continually update the distances to the locations where it has exchanged information with the other nodes in the past; and (ii) measure the distance between a neighbor and any such locations. Based on this approach, we provide a \emph{linear update} to find the locations of an arbitrary number of mobile agents when they follow some convexity in their deployment and motion. Since the agents are mobile, they may not be able to find nearby nodes (agents and/or anchors) to implement a distributed algorithm. To address this issue, we introduce the notion of a \emph{virtual convex hull} with the help of the aforementioned geometric approach. In particular, each agent keeps track of a virtual convex hull of other nodes, which may not physically exist, and updates its location with respect to its neighbors in the virtual hull. We show that the corresponding localization algorithm, in the absence of noise, can be abstracted as a Linear Time-Varying (LTV) system, with non-deterministic system matrices, which asymptotically tracks the true locations of the agents. We provide simulations to verify the analytical results and evaluate the performance of the algorithm in the presence of noise on the motion as well as on the distance measurements.

3.2ROAug 28, 2025

Multi-robot Path Planning and Scheduling via Model Predictive Optimal Transport (MPC-OT)

Usman A. Khan, Mouhacine Benosman, Wenliang Liu et al.

In this paper, we propose a novel methodology for path planning and scheduling for multi-robot navigation that is based on optimal transport theory and model predictive control. We consider a setup where $N$ robots are tasked to navigate to $M$ targets in a common space with obstacles. Mapping robots to targets first and then planning paths can result in overlapping paths that lead to deadlocks. We derive a strategy based on optimal transport that not only provides minimum cost paths from robots to targets but also guarantees non-overlapping trajectories. We achieve this by discretizing the space of interest into $K$ cells and by imposing a ${K\times K}$ cost structure that describes the cost of transitioning from one cell to another. Optimal transport then provides \textit{optimal and non-overlapping} cell transitions for the robots to reach the targets that can be readily deployed without any scheduling considerations. The proposed solution requires $\unicode{x1D4AA}(K^3\log K)$ computations in the worst-case and $\unicode{x1D4AA}(K^2\log K)$ for well-behaved problems. To further accommodate potentially overlapping trajectories (unavoidable in certain situations) as well as robot dynamics, we show that a temporal structure can be integrated into optimal transport with the help of \textit{replans} and \textit{model predictive control}.

8.2OCFeb 11, 2022

Distributed saddle point problems for strongly concave-convex functions

Muhammad I. Qureshi, Usman A. Khan

In this paper, we propose GT-GDA, a distributed optimization method to solve saddle point problems of the form: $\min_{\mathbf{x}} \max_{\mathbf{y}} \{F(\mathbf{x},\mathbf{y}) :=G(\mathbf{x}) + \langle \mathbf{y}, \overline{P} \mathbf{x} \rangle - H(\mathbf{y})\}$, where the functions $G(\cdot)$, $H(\cdot)$, and the the coupling matrix $\overline{P}$ are distributed over a strongly connected network of nodes. GT-GDA is a first-order method that uses gradient tracking to eliminate the dissimilarity caused by heterogeneous data distribution among the nodes. In the most general form, GT-GDA includes a consensus over the local coupling matrices to achieve the optimal (unique) saddle point, however, at the expense of increased communication. To avoid this, we propose a more efficient variant GT-GDA-Lite that does not incur the additional communication and analyze its convergence in various scenarios. We show that GT-GDA converges linearly to the unique saddle point solution when $G(\cdot)$ is smooth and convex, $H(\cdot)$ is smooth and strongly convex, and the global coupling matrix $\overline{P}$ has full column rank. We further characterize the regime under which GT-GDA exhibits a network topology-independent convergence behavior. We next show the linear convergence of GT-GDA to an error around the unique saddle point, which goes to zero when the coupling cost ${\langle \mathbf y, \overline{P} \mathbf x \rangle}$ is common to all nodes, or when $G(\cdot)$ and $H(\cdot)$ are quadratic. Numerical experiments illustrate the convergence properties and importance of GT-GDA and GT-GDA-Lite for several applications.

5.6OCFeb 7, 2022

Variance reduced stochastic optimization over directed graphs with row and column stochastic weights

Muhammad I. Qureshi, Ran Xin, Soummya Kar et al.

This paper proposes AB-SAGA, a first-order distributed stochastic optimization method to minimize a finite-sum of smooth and strongly convex functions distributed over an arbitrary directed graph. AB-SAGA removes the uncertainty caused by the stochastic gradients using a node-level variance reduction and subsequently employs network-level gradient tracking to address the data dissimilarity across the nodes. Unlike existing methods that use the nonlinear push-sum correction to cancel the imbalance caused by the directed communication, the consensus updates in AB-SAGA are linear and uses both row and column stochastic weights. We show that for a constant step-size, AB-SAGA converges linearly to the global optimal. We quantify the directed nature of the underlying graph using an explicit directivity constant and characterize the regimes in which AB-SAGA achieves a linear speed-up over its centralized counterpart. Numerical experiments illustrate the convergence of AB-SAGA for strongly convex and nonconvex problems.

6.6SYApr 1, 2021

Distributed support-vector-machine over dynamic balanced directed networks

Mohammadreza Doostmohammadian, Alireza Aghasi, Themistoklis Charalambous et al.

In this paper, we consider the binary classification problem via distributed Support-Vector-Machines (SVM), where the idea is to train a network of agents, with limited share of data, to cooperatively learn the SVM classifier for the global database. Agents only share processed information regarding the classifier parameters and the gradient of the local loss functions instead of their raw data. In contrast to the existing work, we propose a continuous-time algorithm that incorporates network topology changes in discrete jumps. This hybrid nature allows us to remove chattering that arises because of the discretization of the underlying CT process. We show that the proposed algorithm converges to the SVM classifier over time-varying weight balanced directed graphs by using arguments from the matrix perturbation theory.

17.7OCFeb 12, 2021

A Hybrid Variance-Reduced Method for Decentralized Stochastic Non-Convex Optimization

Ran Xin, Usman A. Khan, Soummya Kar

This paper considers decentralized stochastic optimization over a network of $n$ nodes, where each node possesses a smooth non-convex local cost function and the goal of the networked nodes is to find an $ε$-accurate first-order stationary point of the sum of the local costs. We focus on an online setting, where each node accesses its local cost only by means of a stochastic first-order oracle that returns a noisy version of the exact gradient. In this context, we propose a novel single-loop decentralized hybrid variance-reduced stochastic gradient method, called GT-HSGD, that outperforms the existing approaches in terms of both the oracle complexity and practical implementation. The GT-HSGD algorithm implements specialized local hybrid stochastic gradient estimators that are fused over the network to track the global gradient. Remarkably, GT-HSGD achieves a network topology-independent oracle complexity of $O(n^{-1}ε^{-3})$ when the required error tolerance $ε$ is small enough, leading to a linear speedup with respect to the centralized optimal online variance-reduced approaches that operate on a single node. Numerical experiments are provided to illustrate our main technical results.

5.1SYDec 15, 2020

Fast-Convergent Dynamics for Distributed Allocation of Resources Over Switching Sparse Networks with Quantized Communication Links

Mohammadreza Doostmohammadian, Alireza Aghasi, Mohammad Pirani et al.

This paper proposes networked dynamics to solve resource allocation problems over time-varying multi-agent networks. The state of each agent represents the amount of used resources (or produced utilities) while the total amount of resources is fixed. The idea is to optimally allocate the resources among the group of agents by minimizing the overall cost function subject to fixed sum of resources. Each agents' information is restricted to its own state and cost function and those of its immediate in-neighbors. This is motivated by distributed applications such as mobile edge-computing, economic dispatch over smart grids, and multi-agent coverage control. This work provides a fast convergent solution (in comparison with linear dynamics) while considering relaxed network connectivity with quantized communication links. The proposed dynamics reaches optimal solution over switching (possibly disconnected) undirected networks as far as their union over some bounded non-overlapping time-intervals has a spanning-tree. We prove feasibility of the solution, uniqueness of the optimal state, and convergence to the optimal value under the proposed dynamics, where the analysis is applicable to similar 1st-order allocation dynamics with strongly sign-preserving nonlinearities, such as actuator saturation.

12.6OCNov 7, 2020

A fast randomized incremental gradient method for decentralized non-convex optimization

Ran Xin, Usman A. Khan, Soummya Kar

We study decentralized non-convex finite-sum minimization problems described over a network of nodes, where each node possesses a local batch of data samples. In this context, we analyze a single-timescale randomized incremental gradient method, called GT-SAGA. GT-SAGA is computationally efficient as it evaluates one component gradient per node per iteration and achieves provably fast and robust performance by leveraging node-level variance reduction and network-level gradient tracking. For general smooth non-convex problems, we show the almost sure and mean-squared convergence of GT-SAGA to a first-order stationary point and further describe regimes of practical significance where it outperforms the existing approaches and achieves a network topology-independent iteration complexity respectively. When the global function satisfies the Polyak-Lojaciewisz condition, we show that GT-SAGA exhibits linear convergence to an optimal solution in expectation and describe regimes of practical interest where the performance is network topology-independent and improves upon the existing methods. Numerical experiments are included to highlight the main convergence aspects of GT-SAGA in non-convex settings.

16.5LGSep 12, 2020

A general framework for decentralized optimization with first-order methods

Ran Xin, Shi Pu, Angelia Nedić et al.

Decentralized optimization to minimize a finite sum of functions over a network of nodes has been a significant focus within control and signal processing research due to its natural relevance to optimal control and signal estimation problems. More recently, the emergence of sophisticated computing and large-scale data science needs have led to a resurgence of activity in this area. In this article, we discuss decentralized first-order gradient methods, which have found tremendous success in control, signal processing, and machine learning problems, where such methods, due to their simplicity, serve as the first method of choice for many complex inference and training tasks. In particular, we provide a general framework of decentralized first-order methods that is applicable to undirected and directed communication networks alike, and show that much of the existing work on optimization and consensus can be related explicitly to this framework. We further extend the discussion to decentralized stochastic first-order methods that rely on stochastic gradients at each node and describe how local variance reduction schemes, previously shown to have promise in the centralized settings, are able to improve the performance of decentralized methods when combined with what is known as gradient tracking. We motivate and demonstrate the effectiveness of the corresponding methods in the context of machine learning and signal processing problems that arise in decentralized environments.

14.8OCAug 17, 2020

Fast decentralized non-convex finite-sum optimization with recursive variance reduction

Ran Xin, Usman A. Khan, Soummya Kar

This paper considers decentralized minimization of $N:=nm$ smooth non-convex cost functions equally divided over a directed network of $n$ nodes. Specifically, we describe a stochastic first-order gradient method, called GT-SARAH, that employs a SARAH-type variance reduction technique and gradient tracking (GT) to address the stochastic and decentralized nature of the problem. We show that GT-SARAH, with appropriate algorithmic parameters, finds an $ε$-accurate first-order stationary point with $O\big(\max\big\{N^{\frac{1}{2}},n(1-λ)^{-2},n^{\frac{2}{3}}m^{\frac{1}{3}}(1-λ)^{-1}\big\}Lε^{-2}\big)$ gradient complexity, where ${(1-λ)\in(0,1]}$ is the spectral gap of the network weight matrix and $L$ is the smoothness parameter of the cost functions. This gradient complexity outperforms that of the existing decentralized stochastic gradient methods. In particular, in a big-data regime such that ${n = O(N^{\frac{1}{2}}(1-λ)^{3})}$, this gradient complexity furthers reduces to ${O(N^{\frac{1}{2}}Lε^{-2})}$, independent of the network topology, and matches that of the centralized near-optimal variance-reduced methods. Moreover, in this regime GT-SARAH achieves a non-asymptotic linear speedup, in that, the total number of gradient computations at each node is reduced by a factor of $1/n$ compared to the centralized near-optimal algorithms that perform all gradient computations at a single node. To the best of our knowledge, GT-SARAH is the first algorithm that achieves this property. In addition, we show that appropriate choices of local minibatch size balance the trade-offs between the gradient and communication complexity of GT-SARAH. Over infinite time horizon, we establish that all nodes in GT-SARAH asymptotically achieve consensus and converge to a first-order stationary point in the almost sure and mean-squared sense.

7.9LGAug 13, 2020

Push-SAGA: A decentralized stochastic algorithm with variance reduction over directed graphs

Muhammad I. Qureshi, Ran Xin, Soummya Kar et al.

In this paper, we propose Push-SAGA, a decentralized stochastic first-order method for finite-sum minimization over a directed network of nodes. Push-SAGA combines node-level variance reduction to remove the uncertainty caused by stochastic gradients, network-level gradient tracking to address the distributed nature of the data, and push-sum consensus to tackle the challenge of directed communication links. We show that Push-SAGA achieves linear convergence to the exact solution for smooth and strongly convex problems and is thus the first linearly-convergent stochastic algorithm over arbitrary strongly connected directed graphs. We also characterize the regimes in which Push-SAGA achieves a linear speed-up compared to its centralized counterpart and achieves a network-independent convergence rate. We illustrate the behavior and convergence properties of Push-SAGA with the help of numerical experiments on strongly convex and non-convex problems.

25.3OCAug 10, 2020

An improved convergence analysis for decentralized online stochastic non-convex optimization

Ran Xin, Usman A. Khan, Soummya Kar

In this paper, we study decentralized online stochastic non-convex optimization over a network of nodes. Integrating a technique called gradient tracking in decentralized stochastic gradient descent, we show that the resulting algorithm, GT-DSGD, enjoys certain desirable characteristics towards minimizing a sum of smooth non-convex functions. In particular, for general smooth non-convex functions, we establish non-asymptotic characterizations of GT-DSGD and derive the conditions under which it achieves network-independent performances that match the centralized minibatch SGD. In contrast, the existing results suggest that GT-DSGD is always network-dependent and is therefore strictly worse than the centralized minibatch SGD. When the global non-convex function additionally satisfies the Polyak-Lojasiewics (PL) condition, we establish the linear convergence of GT-DSGD up to a steady-state error with appropriate constant step-sizes. Moreover, under stochastic approximation step-sizes, we establish, for the first time, the optimal global sublinear convergence rate on almost every sample path, in addition to the asymptotically optimal sublinear rate in expectation. Since strongly convex functions are a special case of the functions satisfying the PL condition, our results are not only immediately applicable but also improve the currently known best convergence rates and their dependence on problem parameters.

10.1LGMay 15, 2020

S-ADDOPT: Decentralized stochastic first-order optimization over directed graphs

Muhammad I. Qureshi, Ran Xin, Soummya Kar et al.

In this report, we study decentralized stochastic optimization to minimize a sum of smooth and strongly convex cost functions when the functions are distributed over a directed network of nodes. In contrast to the existing work, we use gradient tracking to improve certain aspects of the resulting algorithm. In particular, we propose the~\textbf{\texttt{S-ADDOPT}} algorithm that assumes a stochastic first-order oracle at each node and show that for a constant step-size~$α$, each node converges linearly inside an error ball around the optimal solution, the size of which is controlled by~$α$. For decaying step-sizes~$\mathcal{O}(1/k)$, we show that~\textbf{\texttt{S-ADDOPT}} reaches the exact solution sublinearly at~$\mathcal{O}(1/k)$ and its convergence is asymptotically network-independent. Thus the asymptotic behavior of~\textbf{\texttt{S-ADDOPT}} is comparable to the centralized stochastic gradient descent. Numerical experiments over both strongly convex and non-convex problems illustrate the convergence behavior and the performance comparison of the proposed algorithm.

4.2LGFeb 13, 2020

Gradient tracking and variance reduction for decentralized optimization and machine learning

Ran Xin, Soummya Kar, Usman A. Khan

Decentralized methods to solve finite-sum minimization problems are important in many signal processing and machine learning tasks where the data is distributed over a network of nodes and raw data sharing is not permitted due to privacy and/or resource constraints. In this article, we review decentralized stochastic first-order methods and provide a unified algorithmic framework that combines variance-reduction with gradient tracking to achieve both robust performance and fast convergence. We provide explicit theoretical guarantees of the corresponding methods when the objective functions are smooth and strongly-convex, and show their applicability to non-convex problems via numerical experiments. Throughout the article, we provide intuitive illustrations of the main technical ideas by casting appropriate tradeoffs and comparisons among the methods of interest and by highlighting applications to decentralized training of machine learning models.

14.1OCOct 8, 2019

Variance-Reduced Decentralized Stochastic Optimization with Gradient Tracking -- Part II: GT-SVRG

Ran Xin, Usman A. Khan, Soummya Kar

Decentralized stochastic optimization has recently benefited from gradient tracking methods \cite{DSGT_Pu,DSGT_Xin} providing efficient solutions for large-scale empirical risk minimization problems. In Part I \cite{GT_SAGA} of this work, we develop \textbf{\texttt{GT-SAGA}} that is based on a decentralized implementation of SAGA \cite{SAGA} using gradient tracking and discuss regimes of practical interest where \textbf{\texttt{GT-SAGA}} outperforms existing decentralized approaches in terms of the total number of local gradient computations. In this paper, we describe \textbf{\texttt{GT-SVRG}} that develops a decentralized gradient tracking based implementation of SVRG \cite{SVRG}, another well-known variance-reduction technique. We show that the convergence rate of \textbf{\texttt{GT-SVRG}} matches that of \textbf{\texttt{GT-SAGA}} for smooth and strongly-convex functions and highlight different trade-offs between the two algorithms in various settings.

9.1LGJul 23, 2019

An introduction to decentralized stochastic optimization with gradient tracking

Ran Xin, Soummya Kar, Usman A. Khan

Decentralized solutions to finite-sum minimization are of significant importance in many signal processing, control, and machine learning applications. In such settings, the data is distributed over a network of arbitrarily-connected nodes and raw data sharing is prohibitive often due to communication or privacy constraints. In this article, we review decentralized stochastic first-order optimization methods and illustrate some recent improvements based on gradient tracking and variance reduction, focusing particularly on smooth and strongly-convex objective functions. We provide intuitive illustrations of the main technical ideas as well as applications of the algorithms in the context of decentralized training of machine learning models.

14.0LGMar 18, 2019

Distributed stochastic optimization with gradient tracking over strongly-connected networks

Ran Xin, Anit Kumar Sahu, Usman A. Khan et al.

In this paper, we study distributed stochastic optimization to minimize a sum of smooth and strongly-convex local cost functions over a network of agents, communicating over a strongly-connected graph. Assuming that each agent has access to a stochastic first-order oracle ($\mathcal{SFO}$), we propose a novel distributed method, called $\mathcal{S}$-$\mathcal{AB}$, where each agent uses an auxiliary variable to asymptotically track the gradient of the global cost in expectation. The $\mathcal{S}$-$\mathcal{AB}$ algorithm employs row- and column-stochastic weights simultaneously to ensure both consensus and optimality. Since doubly-stochastic weights are not used, $\mathcal{S}$-$\mathcal{AB}$ is applicable to arbitrary strongly-connected graphs. We show that under a sufficiently small constant step-size, $\mathcal{S}$-$\mathcal{AB}$ converges linearly (in expected mean-square sense) to a neighborhood of the global minimizer. We present numerical simulations based on real-world data sets to illustrate the theoretical results.

1.2SYMay 5, 2019

On the Controllability of Clustered Scale-Free Networks

Mohammadreza Doostmohammadian, Usman A. Khan

In this paper, we compare the number of unmatched nodes and the size of dilations in two main random network models, the Scale-Free and Clustered Scale-Free networks. The number of unmatched nodes determines the necessary number of control inputs and is known to be a measure for network controllability, while the size of dilation is a measure of controllability recovery in case of control input failure. Our results show that clustered version of Scale-Free networks require fewer control inputs for controllability. Further, the average size of dilations is smaller in clustered Scale-Free networks, implying that potentially fewer options for controllability recovery are available.

1.2SYSep 12, 2017

Observational Equivalence in System Estimation: Contractions in Complex Networks

Mohammadreza Doostmohammadian, Hamid R. Rabiee, Houman Zarrabi et al.

Observability of complex systems/networks is the focus of this paper, which is shown to be closely related to the concept of contraction. Indeed, for observable network tracking it is necessary/sufficient to have one node in each contraction measured. Therefore, nodes in a contraction are equivalent to recover for loss of observability, implying that contraction size is a key factor for observability recovery. Here, using a polynomial order contraction detection algorithm, we analyze the distribution of contractions, studying its relation with key network properties. Our results show that contraction size is related to network clustering coefficient and degree heterogeneity. Particularly, in networks with power-law degree distribution, if the clustering coefficient is high there are less contractions with smaller size on average. The implication is that estimation/tracking of such systems requires less number of measurements, while their observational recovery is more restrictive in case of sensor failure. Further, in Small-World networks higher degree heterogeneity implies that there are more contractions with smaller size on average. Therefore, the estimation of representing system requires more measurements, and also the recovery of measurement failure is more limited. These results imply that one can tune the properties of synthetic networks to alleviate their estimation/observability recovery.

4.5ROMar 19, 2017

An opportunistic linear-convex algorithm for localization in mobile robot networks

Sam Safavi, Usman Khan

In this paper, we develop a \textcolor{black}{\emph{distributed}} algorithm to localize a network of robots moving arbitrarily in a bounded region. In the case of such mobile networks, the main challenge is that the robots may not be able to find nearby robots to implement a distributed algorithm. We address this issue by providing an opportunistic algorithm that only implements a location update when there are nearby robots and does not update otherwise. We assume that each robot measures a noisy version of its motion and the distances to the nearby robots. To localize a network of mobile robots in~$\mathbb{R}^m$, we provide a simple \emph{linear} update, which is based on barycentric coordinates and is linear-convex. We abstract the corresponding localization algorithm as a Linear Time-Varying (LTV) system and show that it asymptotically converges to the true locations~of~the robots. We first focus on the noiseless case, where the distance and motion vectors are known (measured) perfectly, and provide sufficient conditions on the convergence of the algorithm. We then evaluate the performance of the algorithm in the presence of noise and provide modifications to counter the undesirable effects of noise. \textcolor{black}{We further show that our algorithm precisely tracks a mobile network as long as there is at least one known beacon (a node whose location is perfectly known).

1.2SYDec 27, 2014

Asymptotic stability of stochastic LTV systems with applications to distributed dynamic fusion

Sam Safavi, Usman A. Khan

In this paper, we investigate asymptotic stability of linear time-varying systems with (sub-) stochastic system matrices. Motivated by distributed dynamic fusion over networks of mobile agents, we impose some mild regularity conditions on the elements of time-varying system matrices. We provide sufficient conditions under which the asymptotic stability of the LTV system can be guaranteed. By introducing the notion of slices, as non-overlapping partitions of the sequence of systems matrices, we obtain stability conditions in terms of the slice lengths and some network parameters. In addition, we apply the LTV stability results to the distributed leader-follower algorithm, and show the corresponding convergence and steady-state. An illustrative example is also included to validate the effectiveness of our approach.