LG · Jan 13, 2025

PRKAN: Parameter-Reduced Kolmogorov-Arnold Networks

Hoang-Thang Ta, Duy-Quy Thai, Anh Tran, Grigori Sidorov, Alexander Gelbukh

arXiv:2501.07032v4 · 15.75 citations · h-index: 21 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a parameter efficiency problem for researchers and practitioners using KANs, representing an incremental improvement.

The paper tackles the high parameter count of Kolmogorov-Arnold Networks (KANs) by introducing PRKANs, which reduce the number of parameters in KAN layers to a level comparable to MLP layers while outperforming several existing KANs on the MNIST and Fashion-MNIST datasets.

Kolmogorov-Arnold Networks (KANs) represent an innovation in neural network architectures, offering a compelling alternative to Multi-Layer Perceptrons (MLPs) in models such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Transformers. By advancing network design, KANs drive groundbreaking research and enable transformative applications across various scientific domains involving neural networks. However, existing KANs often require significantly more parameters in their network layers than MLPs. To address this limitation, this paper introduces PRKANs (Parameter-Reduced Kolmogorov-Arnold Networks), which employ several methods to reduce the parameter count in KAN layers, making them comparable to MLP layers. Experimental results on the MNIST and Fashion-MNIST datasets demonstrate that PRKANs outperform several existing KANs, and their variant with attention mechanisms rivals the performance of MLPs, albeit with slightly longer training times. Furthermore, the study highlights the advantages of Gaussian Radial Basis Functions (GRBFs) and layer normalization in KAN designs. The repository for this work is available at: https://github.com/hoangthangta/All-KAN.
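To make the parameter gap concrete, the following is a minimal sketch, not the authors' implementation. It assumes a generic KAN layer in which every input-output edge carries one learnable coefficient per basis function, and it uses one common Gaussian RBF form, exp(-((x - c) / h)^2). The function names, the hidden width of 64, and the choice of 8 basis functions are hypothetical values picked for illustration; only the input size 784 (a flattened 28x28 MNIST image) comes from the datasets named above. The sketch does not reproduce any of the PRKAN reduction methods, which the abstract does not detail.

```python
import math


def grbf(x, centers, h):
    """Gaussian RBF features of a scalar x, one value per center.

    Assumed form: exp(-((x - c) / h) ** 2). Other papers may scale
    the exponent differently.
    """
    return [math.exp(-(((x - c) / h) ** 2)) for c in centers]


def mlp_layer_params(d_in, d_out):
    """Weights plus biases of a standard fully connected layer."""
    return d_in * d_out + d_out


def kan_layer_params(d_in, d_out, num_basis):
    """Parameters of a generic basis-expansion KAN layer (assumption):
    one coefficient per (input, output, basis function) triple."""
    return d_in * d_out * num_basis


if __name__ == "__main__":
    d_in, d_out, num_basis = 784, 64, 8  # 64 and 8 are illustrative only
    mlp = mlp_layer_params(d_in, d_out)
    kan = kan_layer_params(d_in, d_out, num_basis)
    print(f"MLP layer parameters: {mlp}")
    print(f"KAN layer parameters: {kan}")
    print(f"Ratio KAN / MLP:      {kan / mlp:.2f}")
```

Under these assumptions the KAN layer needs roughly `num_basis` times as many parameters as the MLP layer, which is the overhead that parameter-reduction methods aim to remove.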
