LG · NE · Jun 4, 2024

ReLU-KAN: New Kolmogorov-Arnold Networks that Only Need Matrix Addition, Dot Multiplication, and ReLU

arXiv:2406.02075v2 · 43 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses a computational bottleneck for researchers and practitioners using KANs, offering a more efficient implementation, though it is incremental as it builds on the existing KAN framework.

The paper tackles the limited parallel computing capability of Kolmogorov-Arnold Networks (KAN) on GPUs by proposing ReLU-KAN, which replaces the B-spline basis functions with a simpler basis built from ReLU and point-wise multiplication. The authors report a 20x speedup over traditional KAN for 4-layer networks, along with more stable training.

Limited by the complexity of basis function (B-spline) calculations, Kolmogorov-Arnold Networks (KAN) suffer from restricted parallel computing capability on GPUs. This paper proposes a novel ReLU-KAN implementation that inherits the core idea of KAN. By adopting ReLU (Rectified Linear Unit) and point-wise multiplication, we simplify the design of KAN's basis function and optimize the computation process for efficient CUDA computing. The proposed ReLU-KAN architecture can be readily implemented on existing deep learning frameworks (e.g., PyTorch) for both inference and training. Experimental results demonstrate that ReLU-KAN achieves a 20x speedup compared to traditional KAN with 4-layer networks. Furthermore, ReLU-KAN exhibits a more stable training process with superior fitting ability while preserving the "catastrophic forgetting avoidance" property of KAN. The code is available at https://github.com/quiqi/relu_kan
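To make the "ReLU and point-wise multiplication" idea concrete, here is a minimal sketch in NumPy rather than PyTorch, so it runs without a GPU stack. The abstract does not spell out the basis function, so everything specific below is an assumption for illustration: the bump shape (a squared product of two ReLUs on an interval from `starts` to `ends`), the `16 / (ends - starts)**4` normalization that scales each peak to 1, the grid, and the helper names `relu_basis` and `relu_kan_layer`. The final weighted sum is written as a plain matrix product for brevity. For the authors' actual implementation, see the linked repository.

```python
import numpy as np


def relu(x):
    # Element-wise rectifier: max(x, 0).
    return np.maximum(x, 0.0)


def relu_basis(x, starts, ends):
    """Evaluate bump-shaped basis functions using only ReLU and
    point-wise arithmetic (assumed form, for illustration).

    x:      (batch, in_dim) inputs
    starts: (n_basis,) left edge of each basis function's support
    ends:   (n_basis,) right edge of each basis function's support
    returns (batch, in_dim, n_basis)
    """
    x = x[..., None]                      # broadcast against the basis axis
    left = relu(x - starts)               # zero to the left of the support
    right = relu(ends - x)                # zero to the right of the support
    norm = 16.0 / (ends - starts) ** 4    # makes the peak value equal to 1
    return (left * right) ** 2 * norm


def relu_kan_layer(x, starts, ends, weights):
    """One hypothetical layer: basis expansion, then a weighted sum.

    weights: (in_dim * n_basis, out_dim)
    returns  (batch, out_dim)
    """
    phi = relu_basis(x, starts, ends)
    return phi.reshape(x.shape[0], -1) @ weights


# Usage with an assumed grid of 6 overlapping basis functions.
rng = np.random.default_rng(0)
grid_starts = np.linspace(-0.5, 0.75, 6)
grid_ends = grid_starts + 0.5
inputs = rng.uniform(0.0, 1.0, size=(8, 3))          # batch of 8, 3 features
layer_weights = rng.normal(size=(3 * 6, 2))          # 2 output features
outputs = relu_kan_layer(inputs, grid_starts, grid_ends, layer_weights)
print(outputs.shape)
```

Every basis function is evaluated for the whole batch in one broadcasted expression with no recursion or per-knot branching, which is the property that makes this style of basis friendlier to GPU execution than a recursive B-spline evaluation.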

Code Implementations: 1 repo
