ML · IT · LG · Feb 5, 2021

Generalization Bounds for Noisy Iterative Algorithms Using Properties of Additive Noise Channels

arXiv:2102.02976v4 · 21 citations
Originality: Incremental advance
AI Analysis

This work provides theoretical generalization bounds for noisy iterative algorithms, which is important for researchers and practitioners using methods like DP-SGD and federated learning.

This paper analyzes the generalization of models trained by noisy iterative algorithms by connecting them to additive noise channels. The authors derive distribution-dependent generalization bounds that are applicable to DP-SGD, federated learning, and SGLD, and show they align with empirical observations in neural networks.

Machine learning models trained by different optimization algorithms under different data distributions can exhibit distinct generalization behaviors. In this paper, we analyze the generalization of models trained by noisy iterative algorithms. We derive distribution-dependent generalization bounds by connecting noisy iterative algorithms to additive noise channels found in communication and information theory. Our generalization bounds shed light on several applications, including differentially private stochastic gradient descent (DP-SGD), federated learning, and stochastic gradient Langevin dynamics (SGLD). We demonstrate our bounds through numerical experiments, showing that they can help understand recent empirical observations of the generalization phenomena of neural networks.
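The link between a noisy iterative algorithm and an additive noise channel can be illustrated with a minimal sketch. It assumes a plain mini-batch gradient step on a least-squares loss followed by isotropic Gaussian noise. The function name `noisy_sgd` and the parameters `lr`, `noise_std`, and `batch_size` are hypothetical choices made for this example and are not taken from the paper. At each iteration the noiseless update plays the role of the channel input and the perturbed iterate is the channel output.

```python
import numpy as np


def noisy_sgd(X, y, steps=200, lr=0.1, noise_std=0.05, batch_size=8, seed=0):
    """Mini-batch gradient descent on a least-squares loss with Gaussian
    noise added to every iterate (an illustrative noisy iterative algorithm)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        # Sample a mini-batch without replacement.
        idx = rng.choice(n, size=batch_size, replace=False)
        Xb, yb = X[idx], y[idx]
        # Gradient of 0.5 * mean((Xb @ w - yb) ** 2) with respect to w.
        grad = Xb.T @ (Xb @ w - yb) / batch_size
        # Channel input: the noiseless update.
        u = w - lr * grad
        # Channel output: the update perturbed by additive Gaussian noise.
        w = u + noise_std * rng.standard_normal(d)
    return w


if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    X = data_rng.standard_normal((64, 3))
    w_true = np.array([1.0, -2.0, 0.5])
    y = X @ w_true
    print(noisy_sgd(X, y))
```

Setting `noise_std=0` recovers ordinary mini-batch SGD, which makes it easy to compare the noisy and noiseless iterates on the same data.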

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
