LG AI · Mar 12, 2021

Continual Learning for Recurrent Neural Networks: an Empirical Evaluation

Andrea Cossu, Antonio Carta, Vincenzo Lomonaco, Davide Bacciu

arXiv:2103.07492v420.1124 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for standardized benchmarks and evaluation in continual learning for sequential data. The contribution is incremental: it builds on existing methods without introducing new paradigms.

The paper tackles the problem of fragmented and application-specific approaches in continual learning for recurrent neural networks by organizing the literature, proposing two new benchmarks, and conducting an empirical evaluation. The results show that sequence length and a clear specification of the scenario are key factors. The strategies tested for mitigating forgetting in the class-incremental scenario are not specific to sequential data processing.

Learning continuously throughout a model's lifetime is fundamental to deploying machine learning solutions that are robust to drifts in the data distribution. Advances in Continual Learning (CL) with recurrent neural networks could pave the way to a large number of applications where incoming data is non-stationary, such as natural language processing and robotics. However, the existing body of work on the topic is still fragmented, with approaches that are application-specific and whose assessment is based on heterogeneous learning protocols and datasets. In this paper, we organize the literature on CL for sequential data processing by providing a categorization of the contributions and a review of the benchmarks. We propose two new benchmarks for CL with sequential data based on existing datasets, whose characteristics resemble real-world applications. We also provide a broad empirical evaluation of CL and Recurrent Neural Networks in the class-incremental scenario, testing their ability to mitigate forgetting with a number of different strategies that are not specific to sequential data processing. Our results highlight the key role played by the sequence length and the importance of a clear specification of the CL scenario.
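
To make the class-incremental scenario mentioned in the abstract concrete, here is a minimal Python sketch. It is not taken from the paper. The function names `class_incremental_splits` and `average_forgetting` are hypothetical, and the forgetting measure is one commonly used definition (best earlier accuracy on a step minus final accuracy on it), which may differ from the metric the authors report.

```python
from typing import List, Sequence


def class_incremental_splits(labels: Sequence[int], classes_per_step: int) -> List[List[int]]:
    """Group sample indices into training steps; each step introduces only new classes."""
    classes = sorted(set(labels))
    steps = []
    for start in range(0, len(classes), classes_per_step):
        step_classes = set(classes[start:start + classes_per_step])
        steps.append([i for i, y in enumerate(labels) if y in step_classes])
    return steps


def average_forgetting(acc: List[List[float]]) -> float:
    """Average forgetting over all steps except the last.

    acc[i][j] is the accuracy on step j's test data measured after
    training on step i (only entries with j <= i are read).
    Assumed definition: best accuracy reached on step j before the
    final training step, minus the accuracy on step j at the end.
    """
    last = len(acc) - 1
    if last == 0:
        return 0.0  # a single step cannot exhibit forgetting
    drops = []
    for j in range(last):
        best_before = max(acc[i][j] for i in range(j, last))
        drops.append(best_before - acc[last][j])
    return sum(drops) / len(drops)


if __name__ == "__main__":
    # Toy labels with four classes, two new classes per step.
    print(class_incremental_splits([0, 1, 2, 3, 0, 1, 2, 3], classes_per_step=2))
    # Toy accuracy matrix for three steps (values are illustrative, not results).
    print(average_forgetting([[0.9], [0.7, 0.8], [0.5, 0.6, 0.9]]))
```

In this setup the model never revisits data from earlier steps unless a strategy such as replay supplies it, which is why accuracy on earlier classes can drop as training proceeds.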
