Y. Sungtaek Ju

h-index: 6

5 papers

2 citations

Novelty: 60%

AI Score: 50

Ranked #42,224 of 201,326 authors (top 21%)

#9,627 in LG (top 23%)

5 Papers

LG | Feb 3

SymPlex: A Structure-Aware Transformer for Symbolic PDE Solving

Yesom Park, Annie C. Lu, Shao-Ching Huang et al.

We propose SymPlex, a reinforcement learning framework for discovering analytical symbolic solutions to partial differential equations (PDEs) without access to ground-truth expressions. SymPlex formulates symbolic PDE solving as tree-structured decision-making and optimizes candidate solutions using only the PDE and its boundary conditions. At its core is SymFormer, a structure-aware Transformer that models hierarchical symbolic dependencies via tree-relative self-attention and enforces syntactic validity through grammar-constrained autoregressive decoding, overcoming the limited expressivity of sequence-based generators. Unlike numerical and neural approaches that approximate solutions in discretized or implicit function spaces, SymPlex operates directly in symbolic expression space, enabling interpretable and human-readable solutions that naturally represent non-smooth behavior and explicit parametric dependence. Empirical results demonstrate exact recovery of non-smooth and parametric PDE solutions using deep learning-based symbolic methods.
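The grammar-constrained decoding idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the SymFormer implementation: the toy vocabulary `ARITY`, the length budget, and the uniform random choice (standing in for sampling from masked policy logits) are all assumptions made for illustration. The sketch emits tokens in prefix notation and masks any token that would make the expression impossible to complete within the budget, so every finished sequence is a syntactically valid tree.

```python
import random

# Hypothetical toy vocabulary: token -> arity (number of child expressions).
ARITY = {"add": 2, "mul": 2, "sin": 1, "exp": 1, "x": 0, "t": 0, "c": 0}


def valid_next_tokens(open_slots, remaining_budget):
    """Return tokens that keep the prefix expression completable within budget."""
    allowed = []
    for token, arity in ARITY.items():
        # Emitting a token fills one open slot and opens `arity` new ones.
        slots_after = open_slots - 1 + arity
        # Every remaining open slot needs at least one more token.
        if slots_after <= remaining_budget - 1:
            allowed.append(token)
    return allowed


def sample_expression(max_len=15, seed=0):
    """Sample one prefix-notation expression under the grammar mask."""
    rng = random.Random(seed)
    tokens, open_slots = [], 1  # start with a single slot for the root
    while open_slots > 0:
        remaining = max_len - len(tokens)
        choices = valid_next_tokens(open_slots, remaining)
        # Stand-in for sampling from a learned policy with invalid logits masked.
        token = rng.choice(choices)
        tokens.append(token)
        open_slots += ARITY[token] - 1
    return tokens


def is_well_formed(tokens):
    """Check that the token list is exactly one complete prefix expression."""
    open_slots = 1
    for token in tokens:
        if open_slots == 0:
            return False  # extra tokens after the tree was already complete
        open_slots += ARITY[token] - 1
    return open_slots == 0


if __name__ == "__main__":
    print(sample_expression(seed=1))
```

Because terminals (arity 0) always pass the mask, the set of allowed tokens is never empty and decoding always terminates within `max_len` tokens.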

CHEM-PH | Feb 12

Spectral Homogenization of the Radiative Transfer Equation via Low-Rank Tensor Train Decomposition

Y. Sungtaek Ju

Radiative transfer in absorbing-scattering media requires solving a transport equation across a spectral domain with 10^5 to 10^6 molecular absorption lines. Line-by-line (LBL) computation is prohibitively expensive, while existing approximations sacrifice spectral fidelity. We show that the Young-measure homogenization framework produces solution tensors I that admit low-rank tensor-train (TT) decompositions whose bond dimensions remain bounded as the spectral resolution Ns increases. Using molecular line parameters from the HITRAN database for H2O and CO2, we demonstrate that: (i) the TT rank saturates at r = 8 (at tolerance ε = 10^-6) from Ns = 16 to 4096, independent of single-scattering albedo, Henyey-Greenstein asymmetry, temperature, and pressure; (ii) quantized tensor-train (QTT) representations achieve sub-linear storage scaling; (iii) in a controlled comparison using identical opacity data and transport solver, the homogenized approach achieves over an order of magnitude lower L2 error than the correlated-k distribution at equal cost; and (iv) for atomic plasma opacity (aluminum at 60 eV, TOPS database), the TT rank saturates at r = 15 with fundamentally different spectral structure (bound-bound and bound-free transitions spanning 12 decades of dynamic range), confirming that rank boundedness is a property of the transport equation rather than any particular opacity source. These results establish that the spectral complexity of radiative transfer has a finite effective rank exploitable by tensor decomposition, complementing the spatial-angular compression achieved by existing TT and dynamical low-rank approaches.
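To make the link between a bounded rank and sub-linear storage concrete, here is a small counting sketch. It is an illustration under stated assumptions, not the paper's computation: it considers only a single spectral axis of length Ns = 2^d reshaped into d binary modes, caps every bond dimension at a hypothetical `max_rank`, and counts core entries. The function names are invented for this example.

```python
def qtt_bond_dims(num_bits, max_rank):
    """Bond dimensions of a QTT for a length-2**num_bits vector under a rank cap."""
    bonds = []
    for k in range(num_bits + 1):
        # A bond can never exceed the size of either side of the cut.
        exact_limit = min(2 ** k, 2 ** (num_bits - k))
        bonds.append(min(exact_limit, max_rank))
    return bonds


def qtt_storage(num_bits, max_rank):
    """Total number of entries in all QTT cores (each core is r_left x 2 x r_right)."""
    bonds = qtt_bond_dims(num_bits, max_rank)
    return sum(bonds[k] * 2 * bonds[k + 1] for k in range(num_bits))


if __name__ == "__main__":
    # Compare dense storage with capped-rank QTT storage as Ns grows.
    for num_bits in (4, 8, 12):
        dense = 2 ** num_bits
        print(num_bits, dense, qtt_storage(num_bits, max_rank=8))
```

With the cap fixed, the core count grows roughly linearly in d = log2(Ns), so the compressed form only pays off once Ns is large enough; for very short axes the dense vector is smaller.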

LG | Dec 4, 2025

Uncertainty Quantification for Scientific Machine Learning using Sparse Variational Gaussian Process Kolmogorov-Arnold Networks (SVGP KAN)

Y. Sungtaek Ju

Kolmogorov-Arnold Networks have emerged as interpretable alternatives to traditional multi-layer perceptrons. However, standard implementations lack principled uncertainty quantification capabilities essential for many scientific applications. We present a framework integrating sparse variational Gaussian process inference with the Kolmogorov-Arnold topology, enabling scalable Bayesian inference with computational complexity quasi-linear in sample size. Through analytic moment matching, we propagate uncertainty through deep additive structures while maintaining interpretability. We use three example studies to demonstrate the framework's ability to distinguish aleatoric from epistemic uncertainty: calibration of heteroscedastic measurement noise in fluid flow reconstruction, quantification of prediction confidence degradation in multi-step forecasting of advection-diffusion dynamics, and out-of-distribution detection in convolutional autoencoders. These results suggest that Sparse Variational Gaussian Process Kolmogorov-Arnold Networks (SVGP KANs) are a promising architecture for uncertainty-aware learning in scientific machine learning.
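The aleatoric/epistemic distinction in this abstract can be illustrated with the standard law-of-total-variance split. This sketch is a generic textbook decomposition and does not reproduce the paper's analytic moment matching: it assumes a set of hypothetical posterior samples, each giving a predicted mean and a predicted noise variance at one input point.

```python
from statistics import mean, pvariance


def decompose_uncertainty(member_means, member_noise_vars):
    """Split predictive variance at one input into aleatoric and epistemic parts.

    member_means:      predicted mean from each posterior sample (hypothetical)
    member_noise_vars: predicted noise variance from each posterior sample
    """
    # Aleatoric: expected data noise, averaged over posterior samples.
    aleatoric = mean(member_noise_vars)
    # Epistemic: disagreement between posterior samples about the mean.
    epistemic = pvariance(member_means)
    # Law of total variance: the two parts add up to the predictive variance.
    return aleatoric, epistemic, aleatoric + epistemic


if __name__ == "__main__":
    print(decompose_uncertainty([1.0, 3.0], [0.5, 1.5]))
```

Epistemic variance shrinks as posterior samples agree (for example, with more training data near the input), while the aleatoric term reflects measurement noise and does not.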

FLU-DYN | Dec 27, 2025

Uncertainty-Aware Flow Field Reconstruction Using SVGP Kolmogorov-Arnold Networks

Y. Sungtaek Ju

Reconstructing time-resolved flow fields from temporally sparse velocimetry measurements is critical for characterizing many complex thermal-fluid systems. We introduce a machine learning framework for uncertainty-aware flow reconstruction using sparse variational Gaussian processes in the Kolmogorov-Arnold network topology (SVGP-KAN). This approach extends the classical foundations of Linear Stochastic Estimation (LSE) and Spectral Analysis Modal Methods (SAMM) while enabling principled epistemic uncertainty quantification. We perform a systematic comparison of our framework with the classical reconstruction methods as well as Kalman filtering. Using synthetic data from pulsed impingement jet flows, we assess performance across fractional PIV sampling rates ranging from 0.5% to 10%. Evaluation metrics include reconstruction error, generalization gap, structure preservation, and uncertainty calibration. Our SVGP-KAN methods achieve reconstruction accuracy comparable to established methods, while also providing well-calibrated uncertainty estimates that reliably indicate when and where predictions degrade. The results demonstrate a robust, data-driven framework for flow field reconstruction with meaningful uncertainty quantification and offer practical guidance for experimental design in periodic flows.
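For readers unfamiliar with Linear Stochastic Estimation, the classical baseline named in this abstract, a single-probe version can be sketched as follows. This is a minimal illustration under assumptions, not the paper's setup: it uses one probe signal and one target velocity signal, both assumed to be mean-removed fluctuations, and the function names are invented for this example.

```python
def lse_coefficient(probe, target):
    """Least-squares coefficient A minimizing sum((target - A * probe)**2).

    probe, target: equal-length sequences of mean-removed fluctuations
    sampled at the same instants (training snapshots).
    """
    cross = sum(p * u for p, u in zip(probe, target))  # <probe * target>
    auto = sum(p * p for p in probe)                   # <probe * probe>
    return cross / auto


def lse_estimate(coefficient, probe_value):
    """Estimate the target fluctuation from a new probe reading."""
    return coefficient * probe_value


if __name__ == "__main__":
    probe = [1.0, -2.0, 3.0, -1.0]
    target = [0.5 * p for p in probe]
    a = lse_coefficient(probe, target)
    print(a, lse_estimate(a, 2.0))
```

The estimate is a point value with no uncertainty attached, which is the gap a probabilistic model of the kind described above is meant to fill.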

19.8 | LG | Mar 20

SymCircuit: Bayesian Structure Inference for Tractable Probabilistic Circuits via Entropy-Regularized Reinforcement Learning

Y. Sungtaek Ju

Probabilistic circuit (PC) structure learning is hampered by greedy algorithms that make irreversible, locally optimal decisions. We propose SymCircuit, which replaces greedy search with a learned generative policy trained via entropy-regularized reinforcement learning. Instantiating the RL-as-inference framework in the PC domain, we show the optimal policy is a tempered Bayesian posterior, recovering the exact posterior when the regularization temperature is set inversely proportional to the dataset size. The policy is implemented as SymFormer, a grammar-constrained autoregressive Transformer with tree-relative self-attention that guarantees valid circuits at every generation step. We introduce option-level REINFORCE, restricting gradient updates to structural decisions rather than all tokens, yielding a signal-to-noise ratio (SNR) improvement and a more than tenfold gain in sample efficiency on the NLTCS dataset. A three-layer uncertainty decomposition (structural via model averaging, parametric via the delta method, leaf via conjugate Dirichlet-Categorical propagation) is grounded in the multilinear polynomial structure of PC outputs. On NLTCS, SymCircuit closes 93% of the gap to LearnSPN; preliminary results on Plants (69 variables) suggest scalability.
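The statement that the optimal entropy-regularized policy is a tempered posterior can be checked numerically on a toy case. The sketch below rests on assumptions that the abstract does not spell out: a small finite set of candidate structures, a uniform prior, and a reward equal to the per-sample average log-likelihood. Under those assumptions the maximizer of expected reward plus temperature times entropy is a softmax of reward over temperature, and a temperature of 1/N reproduces the exact posterior. All numbers are made up.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def entropy_regularized_policy(rewards, temperature):
    """Maximizer of E[reward] + temperature * entropy over a finite set."""
    return softmax([r / temperature for r in rewards])


def posterior_uniform_prior(total_loglik):
    """Exact posterior over candidates given total log-likelihoods, uniform prior."""
    return softmax(total_loglik)


if __name__ == "__main__":
    n_samples = 100                          # hypothetical dataset size N
    total_loglik = [-120.0, -118.0, -125.0]  # hypothetical log p(D | structure)
    rewards = [ll / n_samples for ll in total_loglik]  # assumed reward definition

    policy = entropy_regularized_policy(rewards, temperature=1.0 / n_samples)
    posterior = posterior_uniform_prior(total_loglik)
    print(policy)
    print(posterior)
```

Raising the temperature above 1/N flattens the policy toward uniform, which corresponds to a tempered (less concentrated) posterior.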