LG Jul 14, 2025

On the Performance of Differentially Private Optimization with Heavy-Tail Class Imbalance

arXiv:2507.10536v1
h-index: 4
Originality: Incremental advance
AI Analysis

The paper addresses performance degradation in differentially private learning on imbalanced datasets, an incremental but important domain-specific problem.

The paper studies differentially private optimization under heavy-tail class imbalance. It shows that DP-GD struggles to learn low-frequency classes, whereas algorithms that estimate second-order information, such as DP-AdamBC, avoid this issue. DP-AdamBC yields ≈8% and ≈5% increases in training accuracy on the least frequent classes in controlled experiments and on real data, respectively.

In this work, we analyze the optimization behaviour of common private learning optimization algorithms under heavy-tail class-imbalanced distributions. We show that, in a stylized model, optimizing with Gradient Descent with differential privacy (DP-GD) suffers when learning low-frequency classes, whereas optimization algorithms that estimate second-order information do not. In particular, DP-AdamBC, which removes the DP bias from the estimate of loss curvature, is a crucial component for avoiding the ill-conditioning caused by heavy-tail class imbalance. Empirically, it fits the data better, with $\approx8\%$ and $\approx5\%$ increases in training accuracy when learning the least frequent classes in controlled experiments and on real data, respectively.
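The contrast between the two optimizers in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the usual DP recipe (clip each per-example gradient to norm `clip_norm`, sum, add Gaussian noise with standard deviation `sigma * clip_norm`, divide by the batch size) and assumes that the bias correction amounts to subtracting the known noise variance `(sigma * clip_norm / batch) ** 2` from Adam's second-moment estimate before taking the square root. All function names, the `state` dictionary, and the default values of `beta1`, `beta2`, and the floor `gamma` are hypothetical choices made for illustration.

```python
import math
import random


def clip(grad, clip_norm):
    """Scale one per-example gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_gradient(per_example_grads, clip_norm, sigma, rng):
    """Clip, sum, add N(0, (sigma * clip_norm)^2) noise per coordinate, average."""
    batch = len(per_example_grads)
    total = [0.0] * len(per_example_grads[0])
    for grad in per_example_grads:
        for i, g in enumerate(clip(grad, clip_norm)):
            total[i] += g
    return [(t + rng.gauss(0.0, sigma * clip_norm)) / batch for t in total]


def dp_gd_step(params, g_priv, lr):
    """Plain gradient step on the privatized gradient (no curvature estimate)."""
    return [p - lr * g for p, g in zip(params, g_priv)]


def init_state(dim):
    """Hypothetical optimizer state: step counter and both moment estimates."""
    return {"t": 0, "m": [0.0] * dim, "v": [0.0] * dim}


def dp_adam_bc_step(params, g_priv, state, lr, sigma, clip_norm, batch,
                    beta1=0.9, beta2=0.999, gamma=1e-8):
    """Adam-style step with the DP noise variance removed from the second moment.

    Assumption: the noise adds (sigma * clip_norm / batch)^2 to the expected
    squared gradient of every coordinate, so that amount is subtracted and the
    result is floored at gamma to keep the denominator positive.
    """
    state["t"] += 1
    t = state["t"]
    noise_var = (sigma * clip_norm / batch) ** 2
    new_params = []
    for i, (p, g) in enumerate(zip(params, g_priv)):
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        m_hat = state["m"][i] / (1 - beta1 ** t)  # standard Adam debiasing
        v_hat = state["v"][i] / (1 - beta2 ** t)
        denom = math.sqrt(max(v_hat - noise_var, gamma))
        new_params.append(p - lr * m_hat / denom)
    return new_params
```

With `sigma = 0` the corrected step reduces to ordinary Adam. With `sigma > 0` the subtraction shrinks the denominator, so coordinates with small true gradients, such as those tied to rare classes, are not rescaled by the noise floor alone. In practice the floor `gamma` would likely need to be much larger than the placeholder default used here.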

