LGJul 21, 2025

Machine Unlearning for Streaming Forgetting

arXiv:2507.15280v1h-index: 5ECAI
Originality Highly original
AI Analysis

This addresses the practical challenge of streaming data removal for machine learning models, which is incremental as it builds on existing unlearning methods by adapting them to sequential requests.

The paper tackles the problem of machine unlearning in streaming scenarios where data removal requests arrive sequentially, rather than in batches, by formalizing it as a distribution shift and proposing an algorithm that achieves efficient forgetting without access to original training data, with theoretical error bounds and experimental validation across models and datasets.

Machine unlearning aims to remove knowledge of the specific training data in a well-trained model. Currently, machine unlearning methods typically handle all forgetting data in a single batch, removing the corresponding knowledge all at once upon request. However, in practical scenarios, requests for data removal often arise in a streaming manner rather than in a single batch, leading to reduced efficiency and effectiveness in existing methods. Such challenges of streaming forgetting have not been the focus of much research. In this paper, to address the challenges of performance maintenance, efficiency, and data access brought about by streaming unlearning requests, we introduce a streaming unlearning paradigm, formalizing the unlearning as a distribution shift problem. We then estimate the altered distribution and propose a novel streaming unlearning algorithm to achieve efficient streaming forgetting without requiring access to the original training data. Theoretical analyses confirm an $O(\sqrt{T} + V_T)$ error bound on the streaming unlearning regret, where $V_T$ represents the cumulative total variation in the optimal solution over $T$ learning rounds. This theoretical guarantee is achieved under mild conditions without the strong restriction of convex loss function. Experiments across various models and datasets validate the performance of our proposed method.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes