LGAI · Sep 26, 2023

Revisiting Softmax Masking: Stop Gradient for Enhancing Stability in Replay-based Continual Learning

arXiv:2309.14808v2 · h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses catastrophic forgetting in continual learning systems, offering an incremental improvement to existing replay-based methods.

The paper tackles catastrophic forgetting in replay-based continual learning by analyzing softmax masking and proposing a general masked softmax that adjusts gradient scales to improve stability. The method enhances performance on benchmarks, even with extremely small buffer sizes.

In replay-based methods for continual learning, replaying input samples in episodic memory has shown its effectiveness in alleviating catastrophic forgetting. However, the potential key factor of cross-entropy loss with softmax in causing catastrophic forgetting has been underexplored. In this paper, we analyze the effect of softmax and revisit softmax masking with negative infinity to shed light on its ability to mitigate catastrophic forgetting. Based on the analyses, it is found that negative infinity masked softmax is not always compatible with dark knowledge. To improve the compatibility, we propose a general masked softmax that controls the stability by adjusting the gradient scale to old and new classes. We demonstrate that utilizing our method on other replay-based methods results in better performance, primarily by enhancing model stability in continual learning benchmarks, even when the buffer size is set to an extremely small value.
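
To make the idea of masking the softmax with negative infinity concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function names, the four-class setup, and the split into old and new classes are assumptions chosen for illustration. It covers only the negative-infinity case mentioned in the abstract; the exact form of the proposed general masked softmax is not stated there, so it is not reproduced.

```python
import math

def masked_softmax(logits, active):
    """Softmax where classes outside `active` are masked with -inf.

    `active` is a hypothetical set of class indices for the current task
    and must contain at least one index.
    """
    masked = [z if i in active else -math.inf for i, z in enumerate(logits)]
    m = max(masked)                            # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in masked]   # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_grad(probs, target):
    """Gradient of cross-entropy w.r.t. the logits: p - one_hot(target)."""
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]

# Assumed toy setup: classes 0-1 belong to an old task, classes 2-3 to the new one.
logits = [2.0, 1.0, 0.5, -0.3]
new_classes = {2, 3}

probs = masked_softmax(logits, new_classes)
grads = cross_entropy_grad(probs, target=2)

print(probs)  # masked (old) classes receive probability 0
print(grads)  # masked (old) classes receive zero gradient
```

In this sketch the old-class logits get exactly zero probability and zero gradient, so training on a new-task sample does not push the old-class outputs down. That is the stability effect the abstract attributes to negative-infinity masking. The same zero probability is also why, per the abstract, this masking is not always compatible with dark knowledge.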

