Scalable Recollections for Continual Lifelong Learning
This paper addresses the efficiency bottleneck in continual lifelong learning, which is crucial for real-world applications but often overlooked. The contribution is best viewed as an incremental improvement.
The paper tackles the problem of efficient storage of experiences in continual lifelong learning, where models must learn from a continuous stream of non-stationary data with limited memory, and reports considerable gains on top of state-of-the-art methods such as GEM.
Given the recent success of deep learning applied to a variety of single tasks, it is natural to consider more human-realistic settings. Perhaps the most difficult of these settings is that of continual lifelong learning, where the model must learn online over a continuous stream of non-stationary data. A successful continual lifelong learning system must have three key capabilities: it must learn and adapt over time, it must not forget what it has learned, and it must be efficient in both training time and memory. Recent techniques have focused their efforts primarily on the first two capabilities, while questions of efficiency remain largely unexplored. In this paper, we consider the problem of efficient and effective storage of experiences over very large time frames. In particular, we consider the case where typical experiences are O(n) bits and memories are limited to O(k) bits for k << n. We present a novel scalable architecture and training algorithm in this challenging domain and provide an extensive evaluation of its performance. Our results show that we can achieve considerable gains on top of state-of-the-art methods such as GEM.
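To make the O(n) versus O(k) storage setting concrete, the following is a minimal arithmetic sketch, not the paper's architecture. It assumes k is the size of each stored memory (one reading of the abstract), and every number in it (image size, code size, memory budget) as well as the helper name buffer_capacity is hypothetical, chosen only for illustration.

```python
def buffer_capacity(budget_bits: int, bits_per_item: int) -> int:
    """Return how many whole items fit in a fixed memory budget."""
    if bits_per_item <= 0:
        raise ValueError("bits_per_item must be positive")
    return budget_bits // bits_per_item


# Hypothetical figures, not taken from the paper.
n_bits = 32 * 32 * 3 * 8        # raw experience: 32x32 RGB image, 8 bits per channel
k_bits = 256                    # assumed size of one compressed memory
budget_bits = 8 * 1024 * 1024 * 8  # assumed 8 MiB memory budget, in bits

raw_capacity = buffer_capacity(budget_bits, n_bits)
compressed_capacity = buffer_capacity(budget_bits, k_bits)

print(f"raw experiences stored:        {raw_capacity}")
print(f"compressed experiences stored: {compressed_capacity}")
print(f"compression factor n/k:        {n_bits // k_bits}")
```

Under these assumed numbers, the same budget holds roughly n/k times more compressed memories than raw experiences, which is why the k << n regime matters for learning over very long time frames.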