LG, Apr 26, 2022

Theoretical Understanding of the Information Flow on Continual Learning Performance

arXiv:2204.12010v2, 10.49 citations, h-index: 11

Originality: Incremental advance

AI Analysis

This work addresses the lack of theoretical understanding in continual learning, offering insights to improve model retention and performance for AI systems that learn sequentially, though it is incremental in nature.

The paper tackles catastrophic forgetting in continual learning by establishing a probabilistic framework for analyzing information flow between network layers, and it reports empirical evidence that optimizing this flow improves performance across multiple tasks.

Continual learning (CL) is a setting in which an agent has to learn from an incoming stream of data sequentially. CL performance evaluates the model's ability to continually learn and solve new problems with incrementally available information over time while retaining previous knowledge. Despite the numerous previous solutions for bypassing the catastrophic forgetting (CF) of previously seen tasks during the learning process, most of them still suffer from significant forgetting, expensive memory cost, or a lack of theoretical understanding of neural networks' behavior while learning new tasks. While the degradation of CL performance under different training regimes has been extensively studied empirically, it has received insufficient attention from a theoretical angle. In this paper, we establish a probabilistic framework to analyze information flow through layers in networks for task sequences and its impact on learning performance. Our objective is to optimize the information preservation between layers while learning new tasks, managing how task-specific knowledge passes through the layers while maintaining model performance on previous tasks. In particular, we study the relationship between CL performance and information flow in the network to answer the question "How can knowledge of information flow between layers be used to alleviate CF?". Our analysis provides novel insights into information adaptation within the layers during the incremental task learning process. Through our experiments, we provide empirical evidence and highlight the performance improvement across multiple tasks.
