CL | Aug 16, 2024

FourierKAN outperforms MLP on Text Classification Head Fine-tuning

arXiv:2408.08803v2 | 10 citations | h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses the need for more efficient and effective classification heads in NLP, offering a lightweight solution that could impact various downstream tasks, though it is incremental in nature.

The paper tackles the problem of fine-tuning classification heads in resource-constrained settings by proposing Fourier KAN (FR-KAN) as an alternative to MLPs, achieving an average improvement of 10% in accuracy and 11% in F1-score across multiple models and tasks.

In resource-constrained settings, adaptation to downstream classification tasks involves fine-tuning the final layer of a classifier (i.e., the classification head) while keeping the rest of the model weights frozen. Multi-Layer Perceptron (MLP) heads fine-tuned on top of pre-trained transformer backbones have long been the de facto standard for text classification head fine-tuning. However, the fixed non-linearity of MLPs often struggles to fully capture the nuances of contextual embeddings produced by pre-trained models, and MLPs are also computationally expensive. In our work, we investigate the efficacy of Kolmogorov-Arnold Networks (KAN) and a variant, Fourier KAN (FR-KAN), as alternative text classification heads. Our experiments reveal that FR-KAN significantly outperforms MLPs, with an average improvement of 10% in accuracy and 11% in F1-score across seven pre-trained transformer models and four text classification tasks. Beyond performance gains, FR-KAN is more computationally efficient and trains faster with fewer parameters. These results underscore the potential of FR-KAN to serve as a lightweight classification head, with broader implications for advancing other Natural Language Processing (NLP) tasks.
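To make the idea of a Fourier KAN classification head more concrete, here is a minimal sketch of a single layer's forward pass in plain Python. It assumes the commonly used Fourier KAN formulation, in which each input feature contributes a learnable sum of sines and cosines to each class logit; the abstract does not spell out the paper's exact parameterization, so the class name `FourierKANHead`, the `grid_size` parameter, the initialization scale, and the helper `mlp_head_parameters` are all hypothetical and chosen only for illustration. The input `x` stands in for an embedding produced by a frozen backbone.

```python
import math
import random


class FourierKANHead:
    """Illustrative single-layer Fourier KAN head (assumed formulation).

    logit_j = bias_j + sum_i sum_{k=1..G} (a[j][i][k] * cos(k * x_i)
                                           + b[j][i][k] * sin(k * x_i))
    """

    def __init__(self, in_dim, num_classes, grid_size=4, seed=0):
        rng = random.Random(seed)
        self.in_dim = in_dim
        self.num_classes = num_classes
        self.grid_size = grid_size  # number of Fourier frequencies G (assumption)
        # Small random init; the scale is an arbitrary choice for this sketch.
        scale = 1.0 / math.sqrt(in_dim * grid_size)

        def init():
            return [[[rng.gauss(0.0, scale) for _ in range(grid_size)]
                     for _ in range(in_dim)]
                    for _ in range(num_classes)]

        self.cos_coef = init()  # a[j][i][k-1]
        self.sin_coef = init()  # b[j][i][k-1]
        self.bias = [0.0] * num_classes

    def forward(self, x):
        """Map one embedding vector (list of floats) to class logits."""
        if len(x) != self.in_dim:
            raise ValueError("expected %d features, got %d" % (self.in_dim, len(x)))
        logits = []
        for j in range(self.num_classes):
            total = self.bias[j]
            for i, xi in enumerate(x):
                for k in range(1, self.grid_size + 1):
                    total += self.cos_coef[j][i][k - 1] * math.cos(k * xi)
                    total += self.sin_coef[j][i][k - 1] * math.sin(k * xi)
            logits.append(total)
        return logits

    def num_parameters(self):
        # Two coefficient tensors of shape (classes, inputs, G) plus one bias per class.
        return 2 * self.num_classes * self.in_dim * self.grid_size + self.num_classes


def mlp_head_parameters(in_dim, hidden_dim, num_classes):
    """Parameter count of a one-hidden-layer MLP head, for comparison."""
    return in_dim * hidden_dim + hidden_dim + hidden_dim * num_classes + num_classes


if __name__ == "__main__":
    head = FourierKANHead(in_dim=8, num_classes=3, grid_size=2)
    embedding = [0.1 * i for i in range(8)]  # stand-in for a frozen-backbone embedding
    print(head.forward(embedding))
    print(head.num_parameters(), mlp_head_parameters(8, 16, 3))
```

In this sketch the learnable non-linearity lives in the Fourier coefficients rather than in a fixed activation function, which is the contrast with MLP heads that the abstract draws. Whether such a head ends up with fewer parameters than an MLP depends on the chosen number of frequencies and the MLP's hidden width; the counts printed here are properties of the toy configuration only and do not reproduce the paper's reported figures.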

Code Implementations: 1 repo