Optimized Architectures for Kolmogorov-Arnold Networks
This work addresses the tension between expressiveness and interpretability in scientific machine learning, offering researchers and practitioners a principled way to obtain models that are both more expressive and more interpretable.
The paper improves Kolmogorov-Arnold networks (KANs) by addressing the trade-off between architectural enhancements and interpretability. It combines overprovisioned architectures with sparsification, yielding models that achieve competitive or superior accuracy while being substantially smaller.
Efforts to improve Kolmogorov-Arnold networks (KANs) with architectural enhancements have been stymied by the complexity those enhancements bring, undermining the interpretability that makes KANs attractive in the first place. Here we study overprovisioned architectures combined with sparsification to learn compact, interpretable KANs without sacrificing accuracy. Crucially, we focus on differentiable sparsification, turning architecture search into an end-to-end optimization problem. Across function approximation benchmarks, dynamical systems forecasting, and real-world prediction tasks, we demonstrate competitive or superior accuracy while discovering substantially smaller models. Overprovisioning and sparsification are synergistic, with the combination outperforming either alone. The result is a principled path toward models that are both more expressive and more interpretable, addressing a key tension in scientific machine learning.
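As a rough illustration of what differentiable sparsification over an overprovisioned model can look like, the sketch below fits a single univariate function from a deliberately oversized library of candidate terms. Each term is multiplied by a sigmoid gate, and a penalty on the gate values is added to the loss, so deciding which terms to keep becomes part of ordinary gradient descent. Everything here is an assumption made for illustration and is not taken from the paper: the candidate library `BASIS`, the sigmoid gating, the penalty weight `lam`, the pruning threshold, and the target function sin(x). An actual KAN would use learnable splines on every edge rather than a fixed library.

```python
import math

# Hypothetical overprovisioned library of candidate terms for one edge function.
BASIS = [
    lambda x: x,
    lambda x: x * x,
    lambda x: x ** 3,
    math.sin,
    math.cos,
]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def predict(x, coeffs, logits):
    # Each term is scaled by a coefficient and a gate in (0, 1).
    return sum(sigmoid(a) * c * b(x) for a, c, b in zip(logits, coeffs, BASIS))


def loss(xs, ys, coeffs, logits, lam):
    # Mean squared error plus a differentiable penalty on the open gates.
    mse = sum((predict(x, coeffs, logits) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return mse + lam * sum(sigmoid(a) for a in logits)


def grad_step(xs, ys, coeffs, logits, lam, lr):
    # One step of plain gradient descent using analytic gradients.
    n = len(xs)
    gates = [sigmoid(a) for a in logits]
    grad_c = [0.0] * len(coeffs)
    grad_a = [lam * g * (1.0 - g) for g in gates]  # gradient of the gate penalty
    for x, y in zip(xs, ys):
        r = predict(x, coeffs, logits) - y  # residual at this sample
        for k, b in enumerate(BASIS):
            bx = b(x)
            grad_c[k] += 2.0 * r * gates[k] * bx / n
            grad_a[k] += 2.0 * r * coeffs[k] * bx * gates[k] * (1.0 - gates[k]) / n
    new_coeffs = [c - lr * g for c, g in zip(coeffs, grad_c)]
    new_logits = [a - lr * g for a, g in zip(logits, grad_a)]
    return new_coeffs, new_logits


def prune(logits, threshold=0.5):
    # Keep only the terms whose gate is open beyond the threshold.
    return [k for k, a in enumerate(logits) if sigmoid(a) > threshold]


def train(steps=1000, lam=0.01, lr=0.01):
    xs = [-2.0 + 0.1 * i for i in range(41)]
    ys = [math.sin(x) for x in xs]  # illustrative target, not a paper benchmark
    coeffs = [0.1] * len(BASIS)
    logits = [0.0] * len(BASIS)  # all gates start half open
    initial = loss(xs, ys, coeffs, logits, lam)
    for _ in range(steps):
        coeffs, logits = grad_step(xs, ys, coeffs, logits, lam, lr)
    final = loss(xs, ys, coeffs, logits, lam)
    return initial, final, coeffs, logits
```

Calling `train()` returns the penalized loss before and after optimization together with the learned coefficients and gate logits, and `prune(logits)` then lists the indices of the terms that survive. Which terms survive depends on `lam`, the learning rate, and the number of steps, so the sketch shows only the mechanism and makes no claim about the results reported above.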