MLLGOct 15, 2025

Gaussian Certified Unlearning in High Dimensions: A Hypothesis Testing Approach

arXiv:2510.13094v14 citationsh-index: 32
Originality Highly original
AI Analysis

This addresses a theoretical bottleneck in machine unlearning for high-dimensional data, offering a more efficient solution than existing approaches.

The paper tackles the problem of machine unlearning in high-dimensional settings where standard optimization assumptions fail, introducing ε-Gaussian certifiability as a robust notion. The result shows that a single Newton step with calibrated Gaussian noise achieves both privacy and accuracy, contrasting with prior work that required at least two steps.

Machine unlearning seeks to efficiently remove the influence of selected data while preserving generalization. Significant progress has been made in low dimensions $(p \ll n)$, but high dimensions pose serious theoretical challenges as standard optimization assumptions of $Ω(1)$ strong convexity and $O(1)$ smoothness of the per-example loss $f$ rarely hold simultaneously in proportional regimes $(p\sim n)$. In this work, we introduce $\varepsilon$-Gaussian certifiability, a canonical and robust notion well-suited to high-dimensional regimes, that optimally captures a broad class of noise adding mechanisms. Then we theoretically analyze the performance of a widely used unlearning algorithm based on one step of the Newton method in the high-dimensional setting described above. Our analysis shows that a single Newton step, followed by a well-calibrated Gaussian noise, is sufficient to achieve both privacy and accuracy in this setting. This result stands in sharp contrast to the only prior work that analyzes machine unlearning in high dimensions \citet{zou2025certified}, which relaxes some of the standard optimization assumptions for high-dimensional applicability, but operates under the notion of $\varepsilon$-certifiability. That work concludes %that a single Newton step is insufficient even for removing a single data point, and that at least two steps are required to ensure both privacy and accuracy. Our result leads us to conclude that the discrepancy in the number of steps arises because of the sub optimality of the notion of $\varepsilon$-certifiability and its incompatibility with noise adding mechanisms, which $\varepsilon$-Gaussian certifiability is able to overcome optimally.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes