MLNov 2, 2024
Federated Learning with Relative FairnessShogo Nakakita, Tatsuya Kaneko, Shinya Takamaeda-Yamazaki et al.
This paper proposes a federated learning framework designed to achieve \textit{relative fairness} for clients. Traditional federated learning frameworks typically ensure absolute fairness by guaranteeing minimum performance across all client subgroups. However, this approach overlooks disparities in model performance between subgroups. The proposed framework uses a minimax problem approach to minimize relative unfairness, extending previous methods in distributionally robust optimization (DRO). A novel fairness index, based on the ratio between large and small losses among clients, is introduced, allowing the framework to assess and improve the relative fairness of trained models. Theoretical guarantees demonstrate that the framework consistently reduces unfairness. We also develop an algorithm, named \textsc{Scaff-PD-IA}, which balances communication and computational efficiency while maintaining minimax-optimal convergence rates. Empirical evaluations on real-world datasets confirm its effectiveness in maintaining model performance while reducing disparity.
MLMay 22, 2025
Sharp concentration of uniform generalization errors in binary linear classificationShogo Nakakita
We examine the concentration of uniform generalization errors around their expectation in binary linear classification problems via an isoperimetric argument. In particular, we establish Poincaré and log-Sobolev inequalities for the joint distribution of the output labels and the label-weighted input vectors, which we apply to derive concentration bounds. The derived concentration bounds are sharp up to moderate multiplicative constants by those under well-balanced labels. In asymptotic analysis, we also show that almost sure convergence of uniform generalization errors to their expectation occurs in very broad settings, such as proportionally high-dimensional regimes. Using this convergence, we establish uniform laws of large numbers under dimension-free conditions.
MLJun 23, 2024
Effect of Random Learning Rate: Theoretical Analysis of SGD Dynamics in Non-Convex Optimization via Stationary DistributionNaoki Yoshida, Shogo Nakakita, Masaaki Imaizumi
We consider a variant of the stochastic gradient descent (SGD) with a random learning rate and reveal its convergence properties. SGD is a widely used stochastic optimization algorithm in machine learning, especially deep learning. Numerous studies reveal the convergence properties of SGD and its theoretically favorable variants. Among these, the analysis of convergence using a stationary distribution of updated parameters provides generalizable results. However, to obtain a stationary distribution, the update direction of the parameters must not degenerate, which limits the applicable variants of SGD. In this study, we consider a novel SGD variant, Poisson SGD, which has degenerated parameter update directions and instead utilizes a random learning rate. Consequently, we demonstrate that a distribution of a parameter updated by Poisson SGD converges to a stationary distribution under weak assumptions on a loss function. Based on this, we further show that Poisson SGD finds global minima in non-convex optimization problems and also evaluate the generalization error using this method. As a proof technique, we approximate the distribution by Poisson SGD with that of the bouncy particle sampler (BPS) and derive its stationary distribution, using the theoretical advance of the piece-wise deterministic Markov process (PDMP).