OC May 23, 2011
On Stochastic Gradient and Subgradient Methods with Adaptive Steplength Sequences
Farzad Yousefian, Angelia Nedić, Uday V. Shanbhag
The performance of standard stochastic approximation implementations can vary significantly with the choice of the steplength sequence, and in general, little guidance is available on good choices. Motivated by this gap, in the first part of the paper, we present two adaptive steplength schemes for strongly convex differentiable stochastic optimization problems, equipped with convergence theory. The first scheme, referred to as a recursive steplength stochastic approximation scheme, optimizes the error bounds to derive a rule that expresses the steplength at a given iteration as a simple function of the steplength at the previous iteration and certain problem parameters. This rule is seen to lead to the optimal steplength sequence over a prescribed set of choices. The second scheme, termed a cascading steplength stochastic approximation scheme, maintains the steplength sequence as a piecewise-constant decreasing function, with the steplength reduced whenever a suitable error threshold is met. In the second part of the paper, we allow for a nondifferentiable objective and propose a local smoothing technique that leads to a differentiable approximation of the function. Assuming a uniform distribution on the local randomness, we establish a Lipschitzian property for the gradient of the approximation and prove that the obtained Lipschitz bound grows at a modest rate with problem size. This facilitates the development of an adaptive steplength stochastic approximation framework, which now requires sampling in the product space of the original measure and the artificially introduced distribution. The resulting adaptive steplength schemes are applied to three stochastic optimization problems. We observe that both schemes perform well in practice and display markedly less reliance on user-defined parameters.
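The abstract does not state the recursive rule itself, so the following is only a sketch of how such a rule can arise. Assume (this is an assumption, not taken from the abstract) a standard mean-squared error bound for strongly convex stochastic approximation, $e_{k+1} \le (1 - 2c\gamma_k)\,e_k + \gamma_k^2 \nu^2$, where $c$ is the strong convexity constant and $\nu^2$ bounds the second moment of the sampled gradient. Minimizing the right-hand side over $\gamma_k$ gives $\gamma_k = c\,e_k/\nu^2$, and substituting back yields the recursion $\gamma_{k+1} = \gamma_k(1 - c\gamma_k)$. The function names and the toy quadratic below are hypothetical and chosen for illustration.

```python
import random


def recursive_steplengths(gamma0, c, n):
    """Return n steplengths from gamma_{k+1} = gamma_k * (1 - c * gamma_k).

    The recursion follows from the assumed error bound in the lead-in;
    it is an illustration, not necessarily the rule used in the paper.
    """
    if not 0.0 < c * gamma0 < 1.0:
        raise ValueError("need 0 < c * gamma0 < 1 so the sequence stays positive")
    gammas = [gamma0]
    for _ in range(n - 1):
        g = gammas[-1]
        gammas.append(g * (1.0 - c * g))
    return gammas


def sgd_quadratic(x0, c, noise_std, gammas, seed=0):
    """Run SGD on the toy objective f(x) = 0.5 * c * x**2.

    Each iteration sees the true gradient c * x plus Gaussian noise.
    """
    rng = random.Random(seed)
    x = x0
    for g in gammas:
        noisy_grad = c * x + rng.gauss(0.0, noise_std)
        x -= g * noisy_grad
    return x


# Hypothetical problem parameters for the toy run.
gammas = recursive_steplengths(gamma0=0.5, c=1.0, n=2000)
x_final = sgd_quadratic(x0=5.0, c=1.0, noise_std=0.1, gammas=gammas)
```

Under these assumptions the steplength needs only the previous steplength and the constant $c$, and it decays roughly like $1/(ck)$ without a user-tuned schedule.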
LG Mar 21
Incentive-Aware Federated Averaging with Performance Guarantees under Strategic Participation
Fateme Maleki, Krishnan Raghavan, Farzad Yousefian
Federated learning (FL) is a communication-efficient collaborative learning framework that enables model training across multiple agents with private local datasets. While the benefits of FL in improving global model performance are well established, individual agents may behave strategically, balancing the learning payoff against the cost of contributing their local data. Motivated by the need for FL frameworks that successfully retain participating agents, we propose an incentive-aware federated averaging method in which, at each communication round, clients transmit both their local model parameters and their updated training dataset sizes to the server. The dataset sizes are dynamically adjusted via a Nash equilibrium (NE)-seeking update rule that captures strategic data participation. We analyze the proposed method under convex and nonconvex global objective settings and establish performance guarantees for the resulting incentive-aware FL algorithm. Numerical experiments on the MNIST and CIFAR-10 datasets demonstrate that agents achieve competitive global model performance while converging to stable data participation strategies.
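To make the communication round concrete, here is a minimal sketch of the server side, assuming the server weights each client's parameters by the dataset size that client reports, as in standard federated averaging. The abstract does not give the aggregation formula or the NE-seeking update for the dataset sizes, so the weighting is an assumption and the size update is left out. The function name `aggregate` is hypothetical.

```python
def aggregate(models, sizes):
    """Average client parameter vectors, weighted by reported dataset sizes.

    models: list of equal-length lists of floats (one per client).
    sizes:  list of non-negative dataset sizes reported in this round.
    Size-weighted averaging is an assumption borrowed from standard FedAvg.
    """
    if len(models) != len(sizes) or not models:
        raise ValueError("need one reported size per client model")
    total = sum(sizes)
    if total <= 0:
        raise ValueError("at least one client must contribute data")
    dim = len(models[0])
    return [
        sum(s * m[j] for m, s in zip(models, sizes)) / total
        for j in range(dim)
    ]


# Two hypothetical clients; the second reports three times as much data.
global_model = aggregate([[1.0, 0.0], [3.0, 2.0]], [100, 300])
```

In this sketch a client that lowers its reported dataset size also lowers its influence on the global model, which is one way the strategic trade-off described above could enter the aggregation.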
LG Mar 20
On Performance Guarantees for Federated Learning with Personalized Constraints
Mohammadjavad Ebrahimi, Daniel Burbano, Farzad Yousefian
Federated learning (FL) has emerged as a communication-efficient algorithmic framework for distributed learning across multiple agents. While standard FL formulations capture unconstrained or globally constrained problems, many practical settings involve heterogeneous resource or model constraints, leading to optimization problems with agent-specific feasible sets. Here, we study a personalized constrained federated optimization problem in which each agent is associated with a convex local objective and a private constraint set. We propose PC-FedAvg, a method in which each agent maintains cross-estimates of the other agents' variables through a multi-block local decision vector. Each agent updates all blocks locally, penalizing infeasibility only in its own block. Moreover, the cross-estimate mechanism enables personalization without requiring consensus or sharing constraint information among agents. We establish communication-complexity rates of $\mathcal{O}(\epsilon^{-2})$ for suboptimality and $\mathcal{O}(\epsilon^{-1})$ for agent-wise infeasibility. Preliminary experiments on the MNIST and CIFAR-10 datasets validate our theoretical findings.
OC Apr 2, 2025
A Randomized Zeroth-Order Hierarchical Framework for Heterogeneous Federated Learning
Yuyang Qiu, Kibaek Kim, Farzad Yousefian
Heterogeneity in federated learning (FL) is a critical and challenging aspect that significantly impacts model performance and convergence. In this paper, we propose a novel framework by formulating heterogeneous FL as a hierarchical optimization problem. This new framework captures both local and global training processes through a bilevel formulation and is capable of the following: (i) addressing client heterogeneity through a personalized learning framework; (ii) capturing the pre-training process on the server side; (iii) updating the global model through nonstandard aggregation; (iv) allowing for nonidentical local steps; and (v) capturing clients' local constraints. We design and analyze an implicit zeroth-order FL method (ZO-HFL), equipped with nonasymptotic convergence guarantees for both the server-agent and the individual client-agents, and asymptotic guarantees for both the server-agent and client-agents in an almost sure sense. Notably, our method does not rely on standard assumptions in heterogeneous FL, such as the bounded gradient dissimilarity condition. We implement our method on image classification tasks and compare with other methods under different heterogeneous settings.
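For readers unfamiliar with zeroth-order methods, the sketch below shows a generic two-point randomized gradient estimator that uses only function evaluations. It is not ZO-HFL: the abstract does not specify the estimator or the implicit bilevel structure, so the sphere-based sampling, the function name `zo_gradient`, and the test objective are all assumptions chosen for illustration.

```python
import math
import random


def zo_gradient(f, x, eta, rng):
    """Two-point zeroth-order gradient estimate of f at x.

    Samples a direction v uniformly on the unit sphere and returns
    n * (f(x + eta*v) - f(x - eta*v)) / (2*eta) * v, where n = len(x).
    """
    n = len(x)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(vi * vi for vi in v))
    v = [vi / norm for vi in v]
    x_plus = [xi + eta * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eta * vi for xi, vi in zip(x, v)]
    scale = n * (f(x_plus) - f(x_minus)) / (2.0 * eta)
    return [scale * vi for vi in v]


def sum_of_squares(x):
    """Toy objective with known gradient 2 * x."""
    return sum(xi * xi for xi in x)


# Average many estimates at a hypothetical point to compare with 2 * x.
rng = random.Random(0)
point = [1.0, -2.0, 0.5]
num_samples = 20000
avg = [0.0, 0.0, 0.0]
for _ in range(num_samples):
    g = zo_gradient(sum_of_squares, point, eta=1e-2, rng=rng)
    avg = [a + gi / num_samples for a, gi in zip(avg, g)]
```

Each estimate is noisy, but for this quadratic its expectation equals the true gradient, so the average should land close to `[2.0, -4.0, 1.0]`.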