M. Rostami

LG
h-index2
4papers
11citations
Novelty50%
AI Score36

4 Papers

OCOct 11, 2023
First-Order Dynamic Optimization for Streaming Convex Costs

M. Rostami, H. Moradian, S. S. Kia

This paper proposes a set of novel optimization algorithms for solving a class of convex optimization problems with time-varying streaming cost function. We develop an approach to track the optimal solution with a bounded error. Unlike the existing results, our algorithm is executed only by using the first-order derivatives of the cost function which makes it computationally efficient for optimization with time-varying cost function. We compare our algorithms to the gradient descent algorithm and show why gradient descent is not an effective solution for optimization problems with time-varying cost. Several examples including solving a model predictive control problem cast as a convex optimization problem with a streaming time-varying cost function demonstrate our results.

19.0LGApr 3
FedScalar: Federated Learning with Scalar Communication for Bandwidth-Constrained Networks

M. Rostami, S. S. Kia

In bandwidth-constrained federated learning~(FL) settings, the repeated upload of high-dimensional model updates from agents to a central server constitutes the primary bottleneck, often rendering standard FL infeasible within practical communication budgets. We propose \emph{FedScalar}, a communication-efficient FL algorithm in which each agent uploads only two scalar values per round, regardless of the model dimension~$d$. Each agent encodes its local update difference as an inner product with a locally generated random vector and transmits the resulting scalar together with the generating seed, enabling the server to reconstruct an unbiased gradient estimate without any high-dimensional transmission. We prove that \emph{FedScalar} achieves a convergence rate of $O(d/\sqrt{K})$ to a stationary point for smooth non-convex loss functions, and show that adopting a Rademacher distribution for the random vector reduces the aggregation variance compared to the Gaussian case. Numerical simulations confirm that the dimension-free upload cost translates into significant improvements in wall-clock time and energy efficiency over \emph{FedAvg} and \emph{QSGD} in bandwidth-constrained settings.

OCOct 19, 2024
Time-Varying Convex Optimization with $O(n)$ Computational Complexity

M. Rostami, S. S. Kia

In this article, we consider the problem of unconstrained time-varying convex optimization, where the cost function changes with time. We provide an in-depth technical analysis of the problem and argue why freezing the cost at each time step and taking finite steps toward the minimizer is not the best tracking solution for this problem. We propose a set of algorithms that by taking into account the temporal variation of the cost aim to reduce the tracking error of the time-varying minimizer of the problem. The main contribution of our work is that our proposed algorithms only require the first-order derivatives of the cost function with respect to the decision variable. This approach significantly reduces computational cost compared to the existing algorithms, which use the inverse of the Hessian of the cost. Specifically, the proposed algorithms reduce the computational cost from $O(n^3)$ to $O(n)$ per timestep, where $n$ is the size of the decision variable. Avoiding the inverse of the Hessian also makes our algorithms applicable to non-convex optimization problems. We refer to these algorithms as $O(n)$-algorithms. These $O(n)$-algorithms are designed to solve the problem for different scenarios based on the available temporal information about the cost. We illustrate our results through various examples, including the solution of a model predictive control problem framed as a convex optimization problem with a streaming time-varying cost function.

LGMar 19, 2024
Projected Forward Gradient-Guided Frank-Wolfe Algorithm via Variance Reduction

M. Rostami, S. S. Kia

This paper aims to enhance the use of the Frank-Wolfe (FW) algorithm for training deep neural networks. Similar to any gradient-based optimization algorithm, FW suffers from high computational and memory costs when computing gradients for DNNs. This paper introduces the application of the recently proposed projected forward gradient (Projected-FG) method to the FW framework, offering reduced computational cost similar to backpropagation and low memory utilization akin to forward propagation. Our results show that trivial application of the Projected-FG introduces non-vanishing convergence error due to the stochastic noise that the Projected-FG method introduces in the process. This noise results in an non-vanishing variance in the Projected-FG estimated gradient. To address this, we propose a variance reduction approach by aggregating historical Projected-FG directions. We demonstrate rigorously that this approach ensures convergence to the optimal solution for convex functions and to a stationary point for non-convex functions. These convergence properties are validated through a numerical example, showcasing the approach's effectiveness and efficiency.