Pablo García-Santaclara

h-index: 1

3 papers

7 citations

Novelty: 48%

AI Score: 41

Ranked #65,315 of 194,257 authors (top 34%); #897 in ML (top 27%)

3 Papers

10.4 | LG | Jul 12, 2024

Overcoming Catastrophic Forgetting in Tabular Data Classification: A Pseudorehearsal-based approach

Pablo García-Santaclara, Bruno Fernández-Castro, Rebeca P. Díaz-Redondo

Continual learning (CL) poses the important challenge of adapting to evolving data distributions without forgetting previously acquired knowledge while consolidating new knowledge. In this paper, we introduce a new methodology, called the Tabular-data Rehearsal-based Incremental Lifelong Learning framework (TRIL3), designed to address catastrophic forgetting in tabular data classification problems. TRIL3 uses the prototype-based incremental generative model XuILVQ to generate synthetic data that preserves old knowledge, and the DNDF algorithm, modified to run incrementally, to learn classification tasks for tabular data without storing old samples. After tests to determine an adequate percentage of synthetic data and to compare TRIL3 with other available CL proposals, we conclude that TRIL3 outperforms other options in the literature using only 50% synthetic data.
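The rehearsal idea in this abstract can be illustrated with a minimal sketch. It does not reproduce XuILVQ or DNDF: the prototypes are assumed to be given, Gaussian jitter around them stands in for the generative model, and reading "50% of synthetic data" as the synthetic share of each training batch is an assumption. The function names `sample_from_prototypes` and `build_rehearsal_batch` are hypothetical.

```python
import numpy as np


def sample_from_prototypes(prototypes, labels, n_samples, noise_scale=0.05, rng=None):
    """Draw synthetic (x, y) pairs by jittering randomly chosen prototypes.

    Gaussian jitter is a stand-in for a real generative model (assumption).
    """
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.integers(0, len(prototypes), size=n_samples)
    noise = rng.normal(0.0, noise_scale, size=(n_samples, prototypes.shape[1]))
    return prototypes[idx] + noise, labels[idx]


def build_rehearsal_batch(x_new, y_new, prototypes, proto_labels,
                          synthetic_fraction=0.5, rng=None):
    """Mix new real samples with synthetic ones.

    synthetic_fraction is the share of the final batch that is synthetic
    (an assumed interpretation of the percentage in the abstract).
    """
    if not 0.0 <= synthetic_fraction < 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1)")
    rng = np.random.default_rng() if rng is None else rng
    n_new = len(x_new)
    # Solve n_syn / (n_new + n_syn) = synthetic_fraction for n_syn.
    n_syn = int(round(n_new * synthetic_fraction / (1.0 - synthetic_fraction)))
    x_syn, y_syn = sample_from_prototypes(prototypes, proto_labels, n_syn, rng=rng)
    x = np.concatenate([x_new, x_syn])
    y = np.concatenate([y_new, y_syn])
    # Shuffle so old-class and new-class samples are interleaved.
    perm = rng.permutation(len(x))
    return x[perm], y[perm]
```

No raw samples from earlier tasks are kept in this sketch; only the prototypes carry old knowledge, which mirrors the "without storing old samples" constraint described above.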

6.9 | ML | Jun 2

Combining Statistical Features and Deep Encodings for Rehearsal-Based Class-Incremental Time Series Classification

Pablo García-Santaclara, Bruno Fernández-Castro, Rebeca Pilar Díaz-Redondo

Many systems used in real-world environments require adding new categories and incorporating new information without forgetting what was previously learnt by the classification model. This is known as class-incremental continual learning, and for multivariate time series it is further complicated by the temporal structure of the data. In this paper, we present a novel approach to class-incremental continual learning for multivariate time-series classification, built on a dual-stream feature extraction pipeline that combines deep temporal embeddings from a pre-trained, frozen foundation model with statistical features. Evaluated on five benchmark datasets, the proposed system achieves competitive average accuracy across all datasets while maintaining low forgetting rates across all experimental configurations.
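A dual-stream feature vector of the kind described here might be assembled as in the sketch below. The abstract does not name the foundation model or the statistical features, so both are assumptions: `placeholder_encoder` is a hypothetical fixed random projection standing in for the frozen model, and the statistical stream uses per-channel mean, standard deviation, minimum, and maximum.

```python
import numpy as np


def statistical_features(series):
    """Per-channel summary statistics for a (timesteps, channels) array.

    The choice of mean/std/min/max is an assumption for illustration.
    """
    return np.concatenate([
        series.mean(axis=0),
        series.std(axis=0),
        series.min(axis=0),
        series.max(axis=0),
    ])


def placeholder_encoder(series, dim=8):
    """Hypothetical stand-in for a pre-trained, frozen foundation model.

    A fixed-seed random projection of the time-averaged series: the weights
    never change, which mimics a frozen encoder.
    """
    rng = np.random.default_rng(0)
    weights = rng.normal(size=(series.shape[1], dim))
    return series.mean(axis=0) @ weights


def dual_stream_features(series, encoder=placeholder_encoder):
    """Concatenate the deep embedding with the statistical features."""
    return np.concatenate([encoder(series), statistical_features(series)])
```

Because the encoder is frozen, only the classifier on top of the concatenated vector would need updating as new classes arrive.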

1.7 | ML | Feb 10

Continual Learning for non-stationary regression via Memory-Efficient Replay

Pablo García-Santaclara, Bruno Fernández-Castro, Rebeca P. Díaz-Redondo et al.

Data streams are rarely static in dynamic environments like Industry 4.0. Instead, they constantly change, making traditional offline models outdated unless they can quickly adjust to the new data. This need can be adequately addressed by continual learning (CL), which allows systems to gradually acquire knowledge without incurring the prohibitive costs of retraining them from scratch. Most research on continual learning focuses on classification problems, while very few studies address regression tasks. We propose the first prototype-based generative replay framework designed for online task-free continual regression. Our approach defines an adaptive output-space discretization model, enabling prototype-based generative replay for continual regression without storing raw data. Evidence obtained from several benchmark datasets shows that our framework reduces forgetting and provides more stable performance than other state-of-the-art solutions.
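Output-space discretization turns a continuous target into pseudo-classes so that class-oriented, prototype-based replay can be applied to regression. The abstract does not specify the adaptive scheme, so the sketch below assumes simple quantile binning, recomputed from whatever targets are available; all function names are hypothetical.

```python
import numpy as np


def quantile_bin_edges(targets, n_bins):
    """Interior bin edges at equally spaced quantiles of the targets.

    Quantile binning is an assumed stand-in for the adaptive discretization.
    """
    qs = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    return np.quantile(targets, qs)


def to_pseudo_class(targets, edges):
    """Map each continuous target to the index of the bin it falls into."""
    return np.digitize(targets, edges)


def bin_representatives(targets, pseudo_classes, n_bins):
    """Mean target value per bin (NaN for empty bins).

    Useful for mapping a pseudo-class back to a continuous value.
    """
    return np.array([
        targets[pseudo_classes == k].mean() if np.any(pseudo_classes == k) else np.nan
        for k in range(n_bins)
    ])
```

With pseudo-classes in place, prototypes can be maintained per bin and replayed without retaining the raw stream, in line with the memory constraint stated above.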