TADIL: Task-Agnostic Domain-Incremental Learning through Task-ID Inference using Transformer Nearest-Centroid Embeddings
This addresses the challenge of continual learning for models handling non-i.i.d. data in real-world applications like autonomous driving, though it is incremental as it builds on existing transformer and clustering methods.
The paper tackles the problem of identifying tasks in domain-incremental learning without supervision by proposing a pipeline that uses transformer embeddings and nearest-centroid clustering, achieving performance matching ground-truth approaches on the SODA10M driving dataset.
Machine Learning (ML) models struggle with data that changes over time or across domains due to factors such as noise, occlusion, illumination, or frequency, unlike humans who can learn from such non independent and identically distributed data. Consequently, a Continual Learning (CL) approach is indispensable, particularly, Domain-Incremental Learning. In this paper, we propose a novel pipeline for identifying tasks in domain-incremental learning scenarios without supervision. The pipeline comprises four steps. First, we obtain base embeddings from the raw data using an existing transformer-based model. Second, we group the embedding densities based on their similarity to obtain the nearest points to each cluster centroid. Third, we train an incremental task classifier using only these few points. Finally, we leverage the lightweight computational requirements of the pipeline to devise an algorithm that decides in an online fashion when to learn a new task using the task classifier and a drift detector. We conduct experiments using the SODA10M real-world driving dataset and several CL strategies. We demonstrate that the performance of these CL strategies with our pipeline can match the ground-truth approach, both in classical experiments assuming task boundaries, and also in more realistic task-agnostic scenarios that require detecting new tasks on-the-fly