Shao-Ting Chiu

h-index: 1

4 papers

4 citations

Novelty: 61%

AI Score: 43

Ranked #51,594 of 194,257 authors (top 27%); #11,807 in LG (top 29%)

4 Papers

4.1 LG Dec 19, 2025

BumpNet: A Sparse MLP Framework for Learning PDE Solutions

Shao-Ting Chiu, Ioannis G. Kevrekidis, Ulisses Braga-Neto

We introduce BumpNet, a sparse multilayer perceptron (MLP) framework for PDE numerical solution and operator learning. BumpNet is based on basis function expansion, which makes it superficially similar to radial-basis function (RBF) networks. However, the basis functions in BumpNet are constructed from ordinary sigmoid activation functions in a sparse multi-layer framework. This makes BumpNet an MLP, not an RBF neural network, enabling the efficient use of modern training techniques optimized for MLPs. All parameters of the basis functions, including shape, location, and amplitude, are fully trainable. Model parsimony is encouraged through a basis function pruning scheme. BumpNet is a general meshless framework that can be combined with existing neural architectures for learning PDE solutions: here, we propose Bump-PINNs (BumpNet with physics-informed neural networks) for solving general PDEs; Bump-EDNN (BumpNet with evolutionary deep neural networks) to solve time-evolution PDEs; and Bump-DeepONet (BumpNet with deep operator networks) for PDE operator learning. We prove that BumpNets and Bump-DeepONets are universal approximators of continuous functions and continuous operators, respectively. Bump-PINNs are trained using the same collocation-based approach used by PINNs; Bump-EDNN uses a BumpNet only in the spatial domain and uses EDNNs to advance the solution in time; while Bump-DeepONets employ a BumpNet regression network as the trunk network of a DeepONet. Extensive numerical experiments demonstrate the efficiency and accuracy of BumpNets.
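The abstract states that the basis functions are built from ordinary sigmoid activations, with trainable shape, location, and amplitude. One common way to get a localized "bump" from sigmoids is to subtract two shifted copies; the sketch below illustrates that idea in one dimension. It is an assumption for illustration only: the function names, the parameterization (`center`, `half_width`, `sharpness`, `amplitude`), and the difference-of-sigmoids form are not taken from the paper and may differ from the authors' construction.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bump_1d(x, center, half_width, sharpness, amplitude=1.0):
    """Hypothetical 1D bump: difference of two shifted sigmoids.

    Close to `amplitude` inside [center - half_width, center + half_width]
    and close to 0 far outside it. All four parameters would be trainable
    in a learned model; here they are plain numbers.
    """
    left = sigmoid(sharpness * (x - (center - half_width)))
    right = sigmoid(sharpness * (x - (center + half_width)))
    return amplitude * (left - right)


def bump_expansion(x, bumps):
    """Basis expansion: sum of bumps, each given as
    (center, half_width, sharpness, amplitude)."""
    return sum(bump_1d(x, *b) for b in bumps)
```

Under these assumptions, pruning a basis function would amount to dropping an entry from `bumps`, for example one whose amplitude has shrunk toward zero.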

7.1 LG Dec 18, 2025

In-Context Multi-Operator Learning with DeepOSets

Shao-Ting Chiu, Aditya Nambiar, Ali Syed et al.

An important application of neural networks to scientific computing has been the learning of non-linear operators. In this framework, a neural network is trained to fit a non-linear map between two infinite-dimensional spaces, for example, the solution operator of ordinary and partial differential equations. Recently, inspired by the discovery of in-context learning for large language models, an even more ambitious paradigm has been explored, called multi-operator learning. In this approach, a neural network is trained to learn many different operators at the same time. In order to evaluate one of the learned operators, the network is passed example inputs and outputs to disambiguate the desired operator. In this work, we provide a precise mathematical formulation of the multi-operator learning problem. In addition, we modify a simple efficient architecture, called DeepOSets, for multi-operator learning and prove its universality for multi-operator learning. Finally, we provide a comprehensive set of experiments that demonstrate the ability of DeepOSets to learn multiple operators corresponding to different initial-value and boundary-value differential equations and use in-context examples to accurately predict the solutions corresponding to queries and differential equations not seen during training. The main advantage of DeepOSets is its architectural simplicity, which allows the derivation of theoretical guarantees and training times that are on the order of minutes, in contrast to similar transformer-based alternatives that are empirically justified and require hours of training.

4.9 LG Jan 12

Free-RBF-KAN: Kolmogorov-Arnold Networks with Adaptive Radial Basis Functions for Efficient Function Learning

Shao-Ting Chiu, Siu Wun Cheung, Ulisses Braga-Neto et al.

Kolmogorov-Arnold Networks (KANs) have shown strong potential for efficiently approximating complex nonlinear functions. However, the original KAN formulation relies on B-spline basis functions, which incur substantial computational overhead due to De Boor's algorithm. To address this limitation, recent work has explored alternative basis functions such as radial basis functions (RBFs) that can improve computational efficiency and flexibility. Yet, standard RBF-KANs often sacrifice accuracy relative to the original KAN design. In this work, we propose Free-RBF-KAN, an RBF-based KAN architecture that incorporates adaptive learning grids and trainable smoothness to close this performance gap. Our method employs freely learnable RBF shapes that dynamically align grid representations with activation patterns, enabling expressive and adaptive function approximation. Additionally, we treat smoothness as a kernel parameter optimized jointly with network weights, without increasing computational complexity. We provide a general universality proof for RBF-KANs, which encompasses our Free-RBF-KAN formulation. Through a broad set of experiments, including multiscale function approximation, physics-informed machine learning, and PDE solution operator learning, Free-RBF-KAN achieves accuracy comparable to the original B-spline-based KAN while delivering faster training and inference. These results highlight Free-RBF-KAN as a compelling balance between computational efficiency and adaptive resolution, particularly for high-dimensional structured modeling tasks.
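To make the idea of an RBF-based KAN edge concrete, the sketch below evaluates a single univariate edge function as a weighted sum of Gaussian RBFs whose centers and widths are free parameters, and sums such edges into one node. The Gaussian kernel, the names `rbf_edge` and `kan_node`, and the argument layout are assumptions made for illustration; the abstract does not specify the kernel or how smoothness is parameterized, so this should not be read as the authors' formulation.

```python
import math


def rbf_edge(x, centers, widths, weights):
    """Hypothetical KAN edge function: a weighted sum of Gaussian RBFs.

    In a trainable model, `centers` (the grid), `widths` (the shape), and
    `weights` would all be learned; here they are fixed lists of floats.
    """
    return sum(
        w * math.exp(-((x - c) / s) ** 2)
        for c, s, w in zip(centers, widths, weights)
    )


def kan_node(xs, edges):
    """One KAN output node: the sum of per-input edge functions.

    `edges[i]` is a (centers, widths, weights) triple applied to `xs[i]`.
    """
    return sum(rbf_edge(x, *edge) for x, edge in zip(xs, edges))
```

Unlike a B-spline edge, evaluating this form needs no recursive basis construction, which is the kind of saving the abstract attributes to RBF bases.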

2.6 LG Oct 11, 2024

DeepOSets: Non-Autoregressive In-Context Learning with Permutation-Invariance Inductive Bias

Shao-Ting Chiu, Junyuan Hong, Ulisses Braga-Neto

In-context learning (ICL) is the remarkable ability displayed by some machine learning models to learn from examples provided in a user prompt without any model parameter updates. ICL was first observed in the domain of large language models, and it has been widely assumed that it is a product of the attention mechanism in autoregressive transformers. In this paper, using stylized regression learning tasks, we demonstrate that ICL can emerge in a non-autoregressive neural architecture with a hard-coded permutation-invariance inductive bias. This novel architecture, called DeepOSets, combines the set learning properties of the DeepSets architecture with the operator learning capabilities of Deep Operator Networks (DeepONets). We provide a representation theorem for permutation-invariant regression learning operators and prove that DeepOSets are universal approximators of this class of operators. We performed comprehensive numerical experiments to evaluate the capabilities of DeepOSets in learning linear, polynomial, and shallow neural network regression, under varying noise levels, dimensionalities, and sample sizes. In the high-dimensional regime, accuracy was enhanced by replacing the DeepSets layer with a Set Transformer. Our results show that DeepOSets deliver accurate and fast results with an order of magnitude fewer parameters than a comparable transformer-based alternative.
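The permutation-invariance idea can be sketched in a few lines: encode each in-context example separately, pool the encodings with a symmetric operation, and decode the pooled vector together with a query, with no parameter updates. In the sketch below, the feature map `phi` and the closed-form linear-regression decoder `predict` are hand-written stand-ins chosen for illustration; they are assumptions, not the learned DeepSets encoder or DeepONet branch and trunk networks of the paper.

```python
import math
import random


def phi(x, y):
    """Hypothetical per-example feature map (stand-in for a learned MLP)."""
    return [x, y, x * y, x * x]


def encode_set(examples):
    """Mean-pool per-example features into one fixed-size embedding.

    Pooling is symmetric, so the order of the examples cannot matter.
    math.fsum gives a correctly rounded sum, which makes the result
    bit-for-bit identical under reordering.
    """
    feats = [phi(x, y) for x, y in examples]
    n = len(feats)
    return [math.fsum(col) / n for col in zip(*feats)]


def predict(embedding, x_query):
    """Hand-written decoder: least-squares line recovered from pooled moments."""
    mx, my, mxy, mxx = embedding
    slope = (mxy - mx * my) / (mxx - mx * mx)
    intercept = my - slope * mx
    return slope * x_query + intercept


# In-context examples drawn from y = 2x + 1.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
shuffled = examples[:]
random.shuffle(shuffled)
```

Because the whole prompt is summarized in a single pass rather than processed token by token, the prediction for a new query costs one decoder evaluation, which is the non-autoregressive property the abstract emphasizes.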