Mo' Memory, Mo' Problems: Stream-Native Machine Unlearning
This work addresses the need for efficient, continuous unlearning in production ML systems, extending a model's lifespan before costly retraining.
The paper tackles machine unlearning on dynamic, non-i.i.d. data streams by developing an online algorithm that achieves a logarithmic regret bound, O(ln T), a first for certified unlearning, while maintaining constant memory usage with an online L-BFGS variant.
Existing machine unlearning work assumes a static, i.i.d. training environment that does not exist in practice. Modern ML pipelines need to learn, unlearn, and predict continuously on production streams of data. We translate batch unlearning to the online setting using notions of regret, sample complexity, and deletion capacity. We tighten the regret bound to a logarithmic $\mathcal{O}(\ln{T})$, a first for a certified unlearning algorithm. When fitted with an online variant of L-BFGS optimization, the algorithm achieves state-of-the-art regret with a constant memory footprint. These changes extend the lifespan of an ML model before expensive retraining, making for a more efficient unlearning process.
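To make the constant-memory claim concrete, here is a minimal sketch of the standard L-BFGS two-loop recursion with a bounded history of curvature pairs. It is not the paper's online variant, whose details the abstract does not give. The class name `LimitedMemoryBFGS`, the `history` parameter, and the curvature threshold are hypothetical choices made for illustration. The point is only that storage depends on the history size and the parameter dimension, not on the stream length $T$.

```python
from collections import deque

import numpy as np


class LimitedMemoryBFGS:
    """Illustrative L-BFGS state: stores at most `history` (s, y) pairs.

    Memory is O(history * d) for d parameters, independent of how many
    stream samples have been processed.
    """

    def __init__(self, history=5):
        # A bounded deque evicts the oldest pair automatically.
        self.pairs = deque(maxlen=history)

    def update(self, s, y):
        """Record s = w_new - w_old and y = grad_new - grad_old."""
        # Assumed safeguard: skip pairs that violate the curvature
        # condition s.y > 0, which keeps the approximation positive definite.
        if float(s @ y) > 1e-10:
            self.pairs.append((np.asarray(s, float), np.asarray(y, float)))

    def direction(self, grad):
        """Two-loop recursion: approximates (inverse Hessian) @ grad."""
        q = np.array(grad, dtype=float)
        alphas = []
        for s, y in reversed(self.pairs):  # newest to oldest
            rho = 1.0 / (y @ s)
            a = rho * (s @ q)
            q -= a * y
            alphas.append((a, rho))
        if self.pairs:
            s, y = self.pairs[-1]
            q *= (s @ y) / (y @ y)  # scale by the initial Hessian guess
        for (s, y), (a, rho) in zip(self.pairs, reversed(alphas)):  # oldest to newest
            b = rho * (y @ q)
            q += (a - b) * s
        return q
```

In a streaming loop, one would call `update` after each step with the latest parameter and gradient differences, then step along `-direction(grad)` with a step size of one's choosing. How learning, deletion requests, and certification noise interleave with these updates is specific to the paper and is not modeled here.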