LG CV · Jun 9, 2021

Optimizing Reusable Knowledge for Continual Learning via Metalearning

Julio Hurtado, Alain Raymond-Saez, Alvaro Soto

arXiv:2106.05390v3 · 16.843 citations · Has Code

Originality: Highly original

AI Analysis

This paper addresses catastrophic forgetting in AI systems that learn tasks sequentially, and represents a strong incremental improvement over existing methods.

The paper tackles catastrophic forgetting in continual learning by proposing MARK, a method that maintains a reusable knowledge base and uses metalearning and trainable masks to selectively reuse weights across tasks. It reports state-of-the-art results on several benchmarks, including average accuracy over 10% higher than the best-performing prior methods on the 20-Split-MiniImageNet dataset, with almost zero forgetting while using 55% of the number of parameters.

When learning tasks over time, artificial neural networks suffer from a problem known as Catastrophic Forgetting (CF). This happens when the weights of a network are overwritten during the training of a new task, causing forgetting of old information. To address this issue, we propose MetA Reusable Knowledge, or MARK, a new method that fosters weight reusability instead of overwriting when learning a new task. Specifically, MARK keeps a set of shared weights among tasks. We envision these shared weights as a common Knowledge Base (KB) that is not only used to learn new tasks, but also enriched with new knowledge as the model learns new tasks. MARK has two key components. On the one hand, a metalearning approach provides the key mechanism to incrementally enrich the KB with new knowledge and to foster weight reusability among tasks. On the other hand, a set of trainable masks provides the key mechanism to selectively choose from the KB the weights relevant to solving each task. By using MARK, we achieve state-of-the-art results in several popular benchmarks, surpassing the best-performing methods in terms of average accuracy by over 10% on the 20-Split-MiniImageNet dataset, while achieving almost zero forgetfulness using 55% of the number of parameters. Furthermore, an ablation study provides evidence that MARK is indeed learning reusable knowledge that is selectively used by each task.
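The mask-based selection described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single frozen linear layer as the shared knowledge base, one vector of sigmoid-gated mask logits per task, and a squared-error objective, and it leaves out the metalearning step that enriches the KB. The names `MaskedKnowledgeBase`, `add_task`, and `train_mask_step` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class MaskedKnowledgeBase:
    """Toy shared KB (assumption): one linear layer, gated per task."""

    def __init__(self, n_inputs, n_features, seed=0):
        rng = random.Random(seed)
        # Shared weights reused by every task (the "knowledge base").
        self.weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                        for _ in range(n_features)]
        # One vector of trainable mask logits per task id.
        self.mask_logits = {}

    def add_task(self, task_id):
        # A logit of 0.0 gives every feature an initial gate of 0.5.
        self.mask_logits[task_id] = [0.0] * len(self.weights)

    def features(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.weights]

    def forward(self, x, task_id):
        gates = [sigmoid(l) for l in self.mask_logits[task_id]]
        return [g * f for g, f in zip(gates, self.features(x))]

    def train_mask_step(self, x, target, task_id, lr=0.5):
        """One gradient step on squared error, updating this task's mask only."""
        feats = self.features(x)
        logits = self.mask_logits[task_id]
        for i, (f, t) in enumerate(zip(feats, target)):
            g = sigmoid(logits[i])
            # d/dlogit of (g*f - t)^2 = 2*(g*f - t) * f * g*(1 - g)
            logits[i] -= lr * 2.0 * (g * f - t) * f * g * (1.0 - g)


kb = MaskedKnowledgeBase(n_inputs=3, n_features=4)
kb.add_task("task_a")
kb.add_task("task_b")
x = [1.0, 0.5, -0.5]

before = kb.forward(x, "task_a")
for _ in range(50):
    # Train task_b to suppress every KB feature for this input.
    kb.train_mask_step(x, [0.0, 0.0, 0.0, 0.0], "task_b")
after = kb.forward(x, "task_a")
```

Because only the mask for `task_b` is updated and the shared weights stay fixed in this sketch, the output for `task_a` is unchanged after training. In the method as described, the KB itself is also updated through metalearning, which this example does not model.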

