Decentralized Stochastic Gradient Descent Ascent for Finite-Sum Minimax Problems
This work addresses a gap in distributed machine learning for minimax problems: it provides what the authors believe is the first method with explicit sample and communication complexity guarantees in the decentralized finite-sum setting. The contribution is incremental, but it offers concrete improvements for researchers and practitioners in optimization.
The paper tackles finite-sum minimax optimization in a decentralized setting, where the training data is distributed across multiple workers. It develops a novel decentralized stochastic gradient descent ascent method that uses variance-reduced gradients. For nonconvex-strongly-concave problems, the method achieves a sample complexity of O(√n·κ³/((1-λ)²ε²)) and a communication complexity of O(κ³/((1-λ)²ε²)).
Minimax optimization problems have attracted significant attention in recent years due to their widespread application in numerous machine learning models. To solve the minimax problem, a wide variety of stochastic optimization methods have been proposed. However, most of them ignore the distributed setting, where the training data is distributed across multiple workers. In this paper, we develop a novel decentralized stochastic gradient descent ascent method for the finite-sum minimax problem. In particular, by employing the variance-reduced gradient, our method achieves $O(\frac{\sqrt{n}\kappa^3}{(1-\lambda)^2\epsilon^2})$ sample complexity and $O(\frac{\kappa^3}{(1-\lambda)^2\epsilon^2})$ communication complexity for the nonconvex-strongly-concave minimax problem. As far as we know, our work is the first to achieve such theoretical complexities for this kind of minimax problem. Finally, we apply our method to AUC maximization, and the experimental results confirm the effectiveness of our method.
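To make the decentralized setting concrete, the following is a minimal sketch of gradient descent ascent over a network of workers. It is not the method proposed in the paper: it uses full local gradients instead of variance-reduced stochastic gradients, and it runs on a toy strongly-convex-strongly-concave quadratic rather than a nonconvex-strongly-concave objective. The ring topology, the gradient-tracking update, the local objectives, and every name in the code (`ring_mixing_matrix`, `local_grads`, `decentralized_gda`, the coefficients `a`, `b`, `c`) are assumptions chosen for illustration only.

```python
import numpy as np


def ring_mixing_matrix(num_workers):
    """Doubly stochastic mixing matrix for a ring (assumes num_workers >= 3).

    Each worker averages its own value with its two ring neighbours.
    """
    W = np.zeros((num_workers, num_workers))
    for k in range(num_workers):
        for j in (k - 1, k, k + 1):
            W[k, j % num_workers] = 1.0 / 3.0
    return W


def local_grads(x, y, a, b, c):
    """Gradients of the hypothetical local objectives, one entry per worker.

    f_k(x, y) = 0.5 * a_k * x^2 + b_k * x * y - 0.5 * y^2 + c_k * x
    """
    gx = a * x + b * y + c  # partial derivative with respect to x
    gy = b * x - y          # partial derivative with respect to y
    return gx, gy


def decentralized_gda(a, b, c, step=0.02, iters=5000):
    """Gradient descent ascent with gossip averaging and gradient tracking."""
    num_workers = len(a)
    W = ring_mixing_matrix(num_workers)
    x = np.zeros(num_workers)  # worker k holds its own copy x[k]
    y = np.zeros(num_workers)  # worker k holds its own copy y[k]
    gx, gy = local_grads(x, y, a, b, c)
    u, v = gx.copy(), gy.copy()  # trackers of the network-average gradients
    for _ in range(iters):
        # Communicate with neighbours, then descend in x and ascend in y.
        x_new = W @ x - step * u
        y_new = W @ y + step * v
        gx_new, gy_new = local_grads(x_new, y_new, a, b, c)
        # Gradient tracking: mix the trackers, add the local gradient change.
        u = W @ u + gx_new - gx
        v = W @ v + gy_new - gy
        x, y, gx, gy = x_new, y_new, gx_new, gy_new
    return x, y


# Illustrative per-worker coefficients (made up for this sketch).
a = np.array([1.0, 2.0, 0.5, 1.5, 1.0])
b = np.array([0.5, -0.2, 0.8, 0.3, 0.1])
c = np.array([1.0, -1.0, 2.0, 0.0, 0.5])

x, y = decentralized_gda(a, b, c)

# Saddle point of the averaged objective, available in closed form here.
x_star = -c.mean() / (a.mean() + b.mean() ** 2)
y_star = b.mean() * x_star
```

In this sketch each worker only multiplies by its row of `W`, which corresponds to exchanging values with its two ring neighbours, so no central server is involved. The local copies `x[k]` and `y[k]` are expected to agree with each other and approach `(x_star, y_star)`, the saddle point of the average of the local objectives.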