LG · Sep 3, 2025

Initialization Schemes for Kolmogorov-Arnold Networks: An Empirical Study

arXiv:2509.03417v1 · 8 citations · h-index: 8 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses an incremental but practical problem for researchers and practitioners using KANs, providing empirical guidance on initialization to improve performance.

The authors tackled the problem of initialization strategies for Kolmogorov-Arnold Networks (KANs), which lack established methods, by proposing theory-driven and empirical approaches and evaluating them on function fitting, PDE benchmarks, and the Feynman dataset. Their results show that Glorot-inspired initialization outperforms baselines in parameter-rich models, while power-law initialization achieves the strongest overall performance across tasks and architectures.

Kolmogorov-Arnold Networks (KANs) are a recently introduced neural architecture that replaces fixed nonlinearities with trainable activation functions, offering enhanced flexibility and interpretability. While KANs have been applied successfully across scientific and machine learning tasks, their initialization strategies remain largely unexplored. In this work, we study initialization schemes for spline-based KANs, proposing two theory-driven approaches inspired by LeCun and Glorot, as well as an empirical power-law family with tunable exponents. Our evaluation combines large-scale grid searches on function fitting and forward PDE benchmarks, an analysis of training dynamics through the lens of the Neural Tangent Kernel, and evaluations on a subset of the Feynman dataset. Our findings indicate that the Glorot-inspired initialization significantly outperforms the baseline in parameter-rich models, while power-law initialization achieves the strongest performance overall, both across tasks and for architectures of varying size. All code and data accompanying this manuscript are publicly available at https://github.com/srigas/KAN_Initialization_Schemes.
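
To make the idea of a "power-law family with tunable exponents" concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the exact parameterization, so everything here is an assumption for illustration: the scale `fan_in**(-alpha) * n_basis**(-beta)`, the function names, and the choice to draw spline coefficients from a zero-mean Gaussian. The Glorot function shown is the classic dense-layer formula, included only as a familiar reference point; the paper's KAN-specific variant may differ.

```python
import math
import random


def glorot_std(fan_in, fan_out):
    """Classic Glorot (Xavier) normal std for a dense layer.

    Reference point only; NOT the paper's KAN-specific derivation.
    """
    return math.sqrt(2.0 / (fan_in + fan_out))


def power_law_std(fan_in, n_basis, alpha, beta):
    """Hypothetical power-law scale with two tunable exponents.

    Assumed form: fan_in^(-alpha) * n_basis^(-beta). The exponents would
    be chosen by a grid search, as the abstract describes.
    """
    return fan_in ** (-alpha) * n_basis ** (-beta)


def init_spline_coeffs(fan_in, fan_out, n_basis, std, seed=0):
    """Draw coefficients of shape (fan_out, fan_in, n_basis) from N(0, std^2).

    One coefficient vector per edge (input-output pair), as in a
    spline-based KAN layer. The Gaussian choice is an assumption.
    """
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, std) for _ in range(n_basis)] for _ in range(fan_in)]
        for _ in range(fan_out)
    ]


# Example: a small grid over the two exponents for one layer.
fan_in, fan_out, n_basis = 4, 3, 8
grid = [(a, b) for a in (0.0, 0.5, 1.0) for b in (0.0, 0.5, 1.0)]
layers = {
    (a, b): init_spline_coeffs(
        fan_in, fan_out, n_basis, power_law_std(fan_in, n_basis, a, b)
    )
    for (a, b) in grid
}
```

In a real study each grid point would be trained and scored on the target task; here the dictionary only holds the initial coefficients for each exponent pair.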
