LG, AI | Jul 12, 2024

Overcoming Catastrophic Forgetting in Tabular Data Classification: A Pseudorehearsal-based approach

Pablo García-Santaclara, Bruno Fernández-Castro, Rebeca P. Díaz-Redondo

arXiv:2407.09039v110.47 citations | h-index: 4

Originality: Incremental advance

AI Analysis

This work addresses the problem of forgetting old knowledge when models learn from evolving tabular data, which is relevant to machine learning practitioners. The contribution is incremental, since it adapts existing methods to a specific data type.

The paper tackles catastrophic forgetting in continual learning for tabular data classification by introducing TRIL3, a framework that combines synthetic data generation with incremental learning algorithms. According to the authors, TRIL3 outperforms other continual learning methods while using only 50% synthetic data.

Continual learning (CL) poses the important challenge of adapting to evolving data distributions without forgetting previously acquired knowledge while consolidating new knowledge. In this paper, we introduce a new methodology, the Tabular-data Rehearsal-based Incremental Lifelong Learning framework (TRIL3), designed to address the phenomenon of catastrophic forgetting in tabular data classification problems. TRIL3 uses the prototype-based incremental generative model XuILVQ to generate synthetic data that preserves old knowledge, and the DNDF algorithm, modified to run incrementally, to learn classification tasks for tabular data without storing old samples. After several tests to determine the adequate percentage of synthetic data and to compare TRIL3 with other available CL proposals, we conclude that TRIL3 outperforms the other options in the literature while using only 50% synthetic data.
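
The pseudorehearsal idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `PrototypeMemory` is a hypothetical stand-in that keeps one running-mean prototype per class and samples noisy copies of it, whereas the paper uses XuILVQ, and the incremental DNDF classifier is omitted entirely. The sketch also assumes that "50% synthetic data" means a number of synthetic samples equal to half the size of the incoming batch; the abstract does not define the ratio.

```python
import random
from collections import defaultdict


class PrototypeMemory:
    """Hypothetical stand-in generator: one running-mean prototype per class.
    This is NOT XuILVQ; it only mimics 'generate old-class samples without storing them'."""

    def __init__(self, noise=0.05, seed=0):
        self.sums = {}                   # per-class sum of feature vectors
        self.counts = defaultdict(int)   # per-class number of samples seen
        self.noise = noise               # std. dev. of Gaussian jitter around a prototype
        self.rng = random.Random(seed)

    def update(self, x, y):
        """Fold one real sample into the prototype of class y."""
        if y not in self.sums:
            self.sums[y] = [0.0] * len(x)
        self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]
        self.counts[y] += 1

    def sample(self, n):
        """Draw n synthetic (features, label) pairs from the known classes."""
        labels = sorted(self.sums)
        out = []
        for _ in range(n):
            y = self.rng.choice(labels)
            proto = [s / self.counts[y] for s in self.sums[y]]
            out.append(([p + self.rng.gauss(0, self.noise) for p in proto], y))
        return out


def rehearsal_batch(new_batch, memory, synthetic_ratio=0.5):
    """Mix new samples with synthetic ones drawn BEFORE the memory sees the new batch.
    synthetic_ratio is relative to len(new_batch) (an assumption of this sketch)."""
    n_synth = int(round(synthetic_ratio * len(new_batch))) if memory.sums else 0
    mixed = list(new_batch) + memory.sample(n_synth)
    memory.rng.shuffle(mixed)
    for x, y in new_batch:
        memory.update(x, y)
    return mixed


memory = PrototypeMemory()
task_a = [([0.0, 0.1], "a"), ([0.2, 0.0], "a"), ([0.1, 0.1], "a"), ([0.0, 0.0], "a")]
task_b = [([1.0, 0.9], "b"), ([0.9, 1.0], "b"), ([1.0, 1.0], "b"), ([1.1, 0.9], "b")]
batch_1 = rehearsal_batch(task_a, memory)  # memory is empty: only the 4 real samples
batch_2 = rehearsal_batch(task_b, memory)  # 4 real "b" samples plus 2 synthetic "a" samples
```

Each mixed batch would then be passed to an incrementally trained classifier, so that old classes keep appearing during training even though no original samples are stored.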
