Hierarchically Gated Experts for Efficient Online Continual Learning
This work addresses catastrophic forgetting in online continual learning. The contribution is incremental: it builds on existing methods to improve efficiency.
The authors tackle online continual learning, where tasks are unknown and data arrives as a single stream. They propose Hierarchically Gated Experts (HGE), which organises experts into a hierarchy so that the best expert for each sample can be selected efficiently. HGE achieves results comparable to current methods with improved efficiency.
Continual Learning models aim to learn a set of tasks under the constraint that the tasks arrive sequentially with no way to access data from previous tasks. The Online Continual Learning framework poses a further challenge where the tasks are unknown and instead the data arrives as a single stream. Building on existing work, we propose a method for identifying these underlying tasks: the Gated Experts (GE) algorithm, where a dynamically growing set of experts allows for new knowledge to be acquired without catastrophic forgetting. Furthermore, we extend GE to Hierarchically Gated Experts (HGE), a method which is able to efficiently select the best expert for each data sample by organising the experts into a hierarchical structure. On standard Continual Learning benchmarks, GE and HGE are able to achieve results comparable with current methods, with HGE doing so more efficiently.
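The following is a minimal sketch of the two gating ideas named in the abstract, not the authors' implementation. The abstract does not say how experts are scored, when a new expert is created, or how the hierarchy is built, so everything concrete here is an assumption made for illustration: each expert is a running-mean prototype, the gate uses Euclidean distance, a new expert is added when no score falls under a hypothetical `threshold`, and the hierarchy is a single level of groups defined by a hypothetical `group_radius`. All class and function names are invented for this example.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


class Expert:
    """Toy expert (assumption): a running mean of the samples routed to it."""

    def __init__(self, x):
        self.mean = list(x)
        self.count = 1

    def score(self, x):
        # Assumed gating score: lower means the expert fits the sample better.
        return distance(self.mean, x)

    def update(self, x):
        # Only the selected expert changes; the others are left untouched,
        # so what they have learned is not overwritten.
        self.count += 1
        self.mean = [m + (xi - m) / self.count for m, xi in zip(self.mean, x)]


def route(experts, x, threshold):
    """Send x to the best-scoring expert, or grow a new expert if none fits.

    Returns the index of the expert that received the sample.
    """
    best = min(range(len(experts)),
               key=lambda i: experts[i].score(x), default=None)
    if best is None or experts[best].score(x) > threshold:
        experts.append(Expert(x))
        return len(experts) - 1
    experts[best].update(x)
    return best


class GatedExperts:
    """Flat gate: every expert is scored for every sample."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.experts = []

    def observe(self, x):
        return route(self.experts, x, self.threshold)


class HierarchicalGatedExperts:
    """Two-level gate: choose a group first, then score only its experts."""

    def __init__(self, threshold, group_radius):
        self.threshold = threshold
        self.group_radius = group_radius
        self.groups = []  # list of (anchor, experts) pairs

    def observe(self, x):
        g = min(range(len(self.groups)),
                key=lambda i: distance(self.groups[i][0], x), default=None)
        if g is None or distance(self.groups[g][0], x) > self.group_radius:
            # No group is close enough: open a new one anchored at x.
            self.groups.append((list(x), []))
            g = len(self.groups) - 1
        return g, route(self.groups[g][1], x, self.threshold)


# A single unlabeled stream that silently switches between two "tasks".
stream = [(0.0, 0.0), (0.5, 0.0), (10.0, 10.0), (10.0, 10.5), (0.0, 0.5)]

flat = GatedExperts(threshold=2.0)
flat_assignments = [flat.observe(x) for x in stream]

hier = HierarchicalGatedExperts(threshold=2.0, group_radius=5.0)
hier_assignments = [hier.observe(x) for x in stream]
```

On this toy stream both gates end up with one expert per cluster. The difference is in the work per sample: the flat gate scores every expert, while the hierarchical gate compares against the group anchors and then scores only the experts inside the chosen group, which is the kind of saving the abstract attributes to HGE.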