LG | AI | CR | Nov 20, 2023

BrainWash: A Poisoning Attack to Forget in Continual Learning

arXiv:2311.11995v3 | 9 citations | h-index: 31
Originality: Highly original
AI Analysis

This work addresses a security vulnerability in continual learning systems, an incremental but important problem for AI safety and robustness.

The paper tackles the susceptibility of continual learning to adversarial attacks by introducing BrainWash, a data poisoning method that induces catastrophic forgetting in a trained continual learner, and it demonstrates performance degradation across various regularization-based baselines.

Continual learning has gained substantial attention within the deep learning community, offering promising solutions to the challenging problem of sequential learning. Yet, a largely unexplored facet of this paradigm is its susceptibility to adversarial attacks, especially with the aim of inducing forgetting. In this paper, we introduce "BrainWash," a novel data poisoning method tailored to impose forgetting on a continual learner. By adding the BrainWash noise to a variety of baselines, we demonstrate how a trained continual learner can be induced to forget its previously learned tasks catastrophically, even when using these continual learning baselines. An important feature of our approach is that the attacker requires no access to previous tasks' data and is armed merely with the model's current parameters and the data belonging to the most recent task. Our extensive experiments highlight the efficacy of BrainWash, showcasing degradation in performance across various regularization-based continual learning methods.
