CVJul 5, 2025

Exploring Kolmogorov-Arnold Network Expansions in Vision Transformers for Mitigating Catastrophic Forgetting in Continual Learning

arXiv:2507.04020v1h-index: 3Mathematics
Originality Incremental advance
AI Analysis

This addresses the problem of catastrophic forgetting in continual learning for vision transformers, offering a novel integration that is incremental but promising for dynamic environments.

The research tackled catastrophic forgetting in continual learning for vision transformers by replacing MLPs with Kolmogorov-Arnold Networks (KANs), resulting in significant mitigation of forgetting and outperforming traditional MLP-based ViTs on benchmark datasets like MNIST and CIFAR100.

Continual learning (CL), the ability of a model to learn new tasks without forgetting previously acquired knowledge, remains a critical challenge in artificial intelligence, particularly for vision transformers (ViTs) utilizing Multilayer Perceptrons (MLPs) for global representation learning. Catastrophic forgetting, where new information overwrites prior knowledge, is especially problematic in these models. This research proposes replacing MLPs in ViTs with Kolmogorov-Arnold Network (KANs) to address this issue. KANs leverage local plasticity through spline-based activations, ensuring that only a subset of parameters is updated per sample, thereby preserving previously learned knowledge. The study investigates the efficacy of KAN-based ViTs in CL scenarios across benchmark datasets (MNIST, CIFAR100), focusing on their ability to retain accuracy on earlier tasks while adapting to new ones. Experimental results demonstrate that KAN-based ViTs significantly mitigate catastrophic forgetting, outperforming traditional MLP-based ViTs in knowledge retention and task adaptation. This novel integration of KANs into ViTs represents a promising step toward more robust and adaptable models for dynamic environments.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes