Scalable Bayesian Physics-Informed Kolmogorov-Arnold Networks
This work addresses computational bottlenecks in uncertainty quantification for scientific machine learning, though the contribution appears incremental because it builds on existing Kolmogorov-Arnold networks and ensemble methods.
The paper tackles computational inefficiency and overfitting in uncertainty quantification for scientific machine learning by proposing a gradient-free method that combines dropout Tikhonov ensemble Kalman inversion with Chebyshev Kolmogorov-Arnold networks. It reports comparable or better accuracy than Hamiltonian Monte Carlo with much higher efficiency and stability, and it uses parameter-space reduction to lower the computational cost further without sacrificing accuracy.
Uncertainty quantification (UQ) plays a pivotal role in scientific machine learning, especially when surrogate models are used to approximate complex systems. Although multilayer perceptrons (MLPs) are commonly employed as surrogates, they often suffer from overfitting due to their large number of parameters. Kolmogorov-Arnold networks (KANs) offer an alternative with fewer parameters. However, gradient-based inference methods such as Hamiltonian Monte Carlo (HMC) can be computationally inefficient when applied to KANs, especially for large-scale datasets, because of the high cost of back-propagation. To address these challenges, we propose a novel approach that combines dropout Tikhonov ensemble Kalman inversion (DTEKI) with Chebyshev KANs. This gradient-free method effectively mitigates overfitting and enhances numerical stability. Additionally, we incorporate the active subspace method to reduce the dimensionality of the parameter space, allowing us to improve the accuracy of predictions and obtain more reliable uncertainty estimates. Extensive experiments demonstrate the efficacy of our approach in various test cases, including scenarios with large datasets and high noise levels. Our results show that the new method achieves comparable or better accuracy than HMC, with much higher efficiency, stability, and scalability. Moreover, by leveraging the low-dimensional parameter subspace, our method preserves prediction accuracy while further reducing the computational cost substantially.
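To make the gradient-free idea concrete, the following is a minimal sketch of a Tikhonov-regularized ensemble Kalman inversion update applied to the coefficients of a Chebyshev KAN layer. It is an illustration under stated assumptions, not the paper's implementation: the dropout component of DTEKI and the active subspace reduction are omitted, the network is a single one-input, one-output layer (so the forward map is linear in the coefficients), and all names (`cheb_features`, `forward`, `tikhonov_eki_step`), the toy target sin(pi x), and the settings `noise_std`, `prior_var`, and the ensemble size are hypothetical choices made for this example.

```python
import numpy as np


def cheb_features(x, degree):
    """Chebyshev polynomials T_0..T_degree at tanh(x); shape (N, degree + 1)."""
    u = np.tanh(x)  # squash inputs into (-1, 1), the natural domain of T_k
    feats = [np.ones_like(u), u]
    for _ in range(2, degree + 1):
        feats.append(2.0 * u * feats[-1] - feats[-2])  # T_{k+1} = 2u T_k - T_{k-1}
    return np.stack(feats[: degree + 1], axis=-1)


def forward(theta, x, degree):
    """Assumed toy model: one Chebyshev KAN edge, y = sum_k theta_k T_k(tanh x)."""
    return cheb_features(x, degree) @ theta


def tikhonov_eki_step(ensemble, forward_fn, y, noise_var, prior_var, rng):
    """One perturbed-observation EKI update on the Tikhonov-augmented system.

    The data are augmented to [y; 0] and the forward map to [G(theta); theta],
    so the update also penalizes large parameters. No gradients are computed.
    """
    n_members, n_params = ensemble.shape
    G = np.stack([forward_fn(th) for th in ensemble])      # (J, d)
    Z = np.hstack([G, ensemble])                           # (J, d + p)
    z_obs = np.concatenate([y, np.zeros(n_params)])        # augmented data
    sigma = np.concatenate([np.full(y.size, noise_var),    # diagonal of the
                            np.full(n_params, prior_var)])  # augmented noise cov
    d_theta = ensemble - ensemble.mean(axis=0)
    d_z = Z - Z.mean(axis=0)
    C_tz = d_theta.T @ d_z / (n_members - 1)               # (p, d + p)
    C_zz = d_z.T @ d_z / (n_members - 1)                   # (d + p, d + p)
    perturbed = z_obs + rng.normal(size=Z.shape) * np.sqrt(sigma)
    innovation = perturbed - Z                             # (J, d + p)
    X = np.linalg.solve(C_zz + np.diag(sigma), innovation.T)
    return ensemble + (C_tz @ X).T


# Toy regression problem (an assumption for illustration only).
rng = np.random.default_rng(0)
degree, n_members, noise_std, prior_var = 5, 50, 0.1, 1.0
x = np.linspace(-1.0, 1.0, 30)
y = np.sin(np.pi * x) + noise_std * rng.normal(size=x.size)


def fwd(theta):
    return forward(theta, x, degree)


ensemble = rng.normal(size=(n_members, degree + 1))        # initial ensemble
initial_mse = np.mean((fwd(ensemble.mean(axis=0)) - y) ** 2)

for _ in range(20):
    ensemble = tikhonov_eki_step(ensemble, fwd, y, noise_std**2, prior_var, rng)

preds = np.stack([fwd(th) for th in ensemble])
pred_mean, pred_std = preds.mean(axis=0), preds.std(axis=0)
final_mse = np.mean((pred_mean - y) ** 2)
print(f"MSE before: {initial_mse:.3f}, after: {final_mse:.3f}")
```

Each update needs only forward evaluations of the ensemble members, which is why no back-propagation is involved. The ensemble spread `pred_std` gives a rough indication of predictive uncertainty in this sketch; it should not be read as a calibrated posterior. A deeper KAN would change only `forward`, at the cost of a nonlinear forward map.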