LG · Dec 27, 2024
Estimation of System Parameters Including Repeated Cross-Sectional Data through Emulator-Informed Deep Generative Model
Hyunwoo Cho, Sung Woong Cho, Hyeontae Jo et al.
Differential equations (DEs) are crucial for modeling the evolution of natural or engineered systems. Traditionally, the parameters in DEs are adjusted to fit data from system observations. However, in fields such as politics, economics, and biology, available data are often independently collected at distinct time points from different subjects (i.e., repeated cross-sectional (RCS) data). Conventional optimization techniques struggle to accurately estimate DE parameters when RCS data exhibit various heterogeneities, leading to a significant loss of information. To address this issue, we propose a new estimation method called the emulator-informed deep-generative model (EIDGM), designed to handle RCS data. Specifically, EIDGM integrates a physics-informed neural network-based emulator that immediately generates DE solutions and a Wasserstein generative adversarial network-based parameter generator that can effectively mimic the RCS data. We evaluated EIDGM on exponential growth, logistic population models, and the Lorenz system, demonstrating its superior ability to accurately capture parameter distributions. Additionally, we applied EIDGM to an experimental dataset of Amyloid beta 40 and beta 42, successfully capturing diverse parameter distribution shapes. This shows that EIDGM can be applied to model a wide range of systems and extended to uncover the operating principles of systems based on limited data.
LG · Jul 15, 2025
Learning from Imperfect Data: Robust Inference of Dynamic Systems using Simulation-based Generative Model
Hyunwoo Cho, Hyeontae Jo, Hyung Ju Hwang
System inference for nonlinear dynamic models, represented by ordinary differential equations (ODEs), remains a significant challenge in many fields, particularly when the data are noisy, sparse, or partially observable. In this paper, we propose a Simulation-based Generative Model for Imperfect Data (SiGMoID) that enables precise and robust inference for dynamic systems. The proposed approach integrates two key methods: (1) physics-informed neural networks with hyper-networks that construct an ODE solver, and (2) Wasserstein generative adversarial networks that estimate ODE parameters by effectively capturing noisy data distributions. We demonstrate that SiGMoID quantifies data noise, estimates system parameters, and infers unobserved system components. Its effectiveness is validated through realistic experimental examples, showcasing its broad applicability in various domains, from scientific research to engineered systems, and enabling the discovery of full system dynamics.
LG · Jul 8, 2025
Neural Network-Based Parameter Estimation for Non-Autonomous Differential Equations with Discontinuous Signals
Hyeontae Jo, Krešimir Josić, Jae Kyoung Kim
Non-autonomous differential equations are crucial for modeling systems influenced by external signals, yet fitting these models to data becomes particularly challenging when the signals change abruptly. To address this problem, we propose a novel parameter estimation method utilizing functional approximations with artificial neural networks. Our approach, termed Harmonic Approximation of Discontinuous External Signals using Neural Networks (HADES-NN), operates in two iterated stages. In the first stage, the algorithm employs a neural network to approximate the discontinuous signal with a smooth function. In the second stage, it uses this smooth approximate signal to estimate model parameters. HADES-NN gives highly accurate and precise parameter estimates across various applications, including circadian clock systems regulated by external light inputs measured via wearable devices and the mating response of yeast to external pheromone signals. HADES-NN greatly extends the range of model systems that can be fit to real-world measurements.
ML · Apr 23, 2024
Estimating the Distribution of Parameters in Differential Equations with Repeated Cross-Sectional Data
Hyeontae Jo, Sung Woong Cho, Hyung Ju Hwang
Differential equations are pivotal in modeling and understanding the dynamics of various systems, offering insights into their future states through parameter estimation fitted to time series data. In fields such as economics, politics, and biology, the observation data points in the time series are often independently obtained (i.e., Repeated Cross-Sectional (RCS) data). With RCS data, we found that traditional methods for parameter estimation in differential equations, such as using mean values of time trajectories or Gaussian Process-based trajectory generation, have limitations in estimating the shape of parameter distributions, often leading to a significant loss of data information. To address this issue, we introduce a novel method, Estimation of Parameter Distribution (EPD), which provides an accurate distribution of parameters without loss of data information. EPD operates in three main steps: generating synthetic time trajectories by randomly selecting observed values at each time point, estimating parameters of a differential equation that minimize the discrepancy between these trajectories and the true solution of the equation, and selecting the parameters depending on the scale of discrepancy. We then evaluated the performance of EPD across several models, including exponential growth, logistic population models, and target cell-limited models with delayed virus production, demonstrating its superiority in capturing the shape of parameter distributions. Furthermore, we applied EPD to real-world datasets, capturing various shapes of parameter distributions rather than only a normal distribution. These results effectively address the heterogeneity within systems, marking a substantial progression in accurately modeling systems using RCS data.
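The three steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `epd_sketch`, its parameters, the log-space least-squares fit, and the quantile-based selection rule are all assumptions made for illustration, and the model is restricted to exponential growth y(t) = y0 * exp(r * t) with a known initial value.

```python
import numpy as np


def epd_sketch(times, observations, y0, n_trajectories=2000,
               keep_fraction=0.2, seed=0):
    """Illustrative three-step procedure for y(t) = y0 * exp(r * t).

    observations has shape (n_times, n_subjects); each row holds values
    measured at one time point from independent subjects (RCS data).
    All names and defaults here are hypothetical.
    """
    rng = np.random.default_rng(seed)
    times = np.asarray(times, dtype=float)
    observations = np.asarray(observations, dtype=float)
    n_times, n_subjects = observations.shape

    # Step 1: build synthetic trajectories by picking one observed value
    # at random at each time point.
    picks = rng.integers(0, n_subjects, size=(n_trajectories, n_times))
    trajectories = observations[np.arange(n_times), picks]

    # Step 2: fit r per trajectory. In log space the model is linear,
    # log(y / y0) = r * t, so least squares has a closed form.
    # (Assumption: the discrepancy is measured in log space.)
    z = np.log(trajectories / y0)
    rates = (z @ times) / (times @ times)
    discrepancy = np.mean((z - rates[:, None] * times) ** 2, axis=1)

    # Step 3 (assumed rule): keep the fits with the smallest discrepancy.
    cutoff = np.quantile(discrepancy, keep_fraction)
    return rates[discrepancy <= cutoff]


# Synthetic RCS data: every observation comes from a different subject
# with its own growth rate drawn around 0.5.
rng = np.random.default_rng(1)
times = np.array([1.0, 2.0, 3.0, 4.0])
true_rates = rng.normal(0.5, 0.05, size=(times.size, 50))
observations = np.exp(true_rates * times[:, None])

kept = epd_sketch(times, observations, y0=1.0)
print(kept.mean(), kept.std())
```

The retained values in `kept` form an empirical sample of the growth rate, so a histogram of them approximates the shape of the parameter distribution instead of a single point estimate. A nonlinear model would need a numerical optimizer in step 2 in place of the closed-form fit.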
LG · Sep 18, 2021
Machine Learning-Based COVID-19 Patients Triage Algorithm using Patient-Generated Health Data from Nationwide Multicenter Database
Min Sue Park, Hyeontae Jo, Haeun Lee et al.
A prompt severity assessment model for patients confirmed to have infectious diseases could enable efficient diagnosis while alleviating the burden on the medical system. This study aims to develop a SARS-CoV-2 severity assessment model and establish a medical system that allows patients to check the severity of their cases and directs them to the appropriate clinic center based on past treatment data of other patients with similar severity levels. This paper describes the development of a severity assessment model using machine learning techniques and its application to SARS-CoV-2 patients. The proposed model is trained on a nationwide dataset provided by a Korean government agency and only requires patients' basic personal data, allowing them to judge the severity of their own cases. After modeling, the boosting-based decision tree model was selected as the classifier while mortality rate was interpreted as the probability score. The dataset was collected from all Korean citizens who were confirmed with COVID-19 between February 2020 and July 2021. The experiments achieved high model performance with an approximate precision of $0.923$ and AUROC score of $0.950$ [$95$% Tolerance Interval $0.940$-$0.958$, $95$% Confidence Interval $0.949$-$0.950$]. Moreover, our experiments identified the most important variables affecting the severity in the model via sensitivity analysis. A prompt severity assessment model for managing infected people has thus been obtained using a nationwide dataset. It outperformed conventional risk assessments. With the model's high performance and easily accessible features, the triage algorithm is expected to be particularly useful when patients monitor their health status by themselves through smartphone applications.
NA · Nov 22, 2019
Trend to Equilibrium for the Kinetic Fokker-Planck Equation via the Neural Network Approach
Hyung Ju Hwang, Jin Woo Jang, Hyeontae Jo et al.
The issue of relaxation to equilibrium has been at the core of the kinetic theory of rarefied gas dynamics. In this paper, we introduce Deep Neural Network (DNN) approximated solutions to the kinetic Fokker-Planck equation in a bounded interval and study the large-time asymptotic behavior of the solutions and other physically relevant macroscopic quantities. We impose various types of boundary conditions, including inflow-type and reflection-type boundaries, as well as varied diffusion and friction coefficients, and study the boundary effects on the asymptotic behavior. These include predictions of the large-time behavior of the pointwise values of the particle distribution and of macroscopic physical quantities including the total kinetic energy, the entropy, and the free energy. We also provide theoretical support for the pointwise convergence of the neural network solutions to the a priori analytic solutions. We use the PyTorch library, the tanh activation function between layers, and the Adam optimizer for the deep learning algorithm.
NA · Jul 27, 2019
Deep Neural Network Approach to Forward-Inverse Problems
Hyeontae Jo, Hwijae Son, Hyung Ju Hwang et al.
In this paper, we construct approximated solutions of Differential Equations (DEs) using a Deep Neural Network (DNN). Furthermore, we present an architecture that includes the process of finding model parameters from experimental data, that is, the inverse problem. In other words, we provide a unified DNN framework that approximates an analytic solution and its model parameters simultaneously. The architecture consists of a feedforward DNN with non-linear activation functions that depend on the DEs, automatic differentiation, reduction of order, and a gradient-based optimization method. We also prove theoretically that the proposed DNN solution converges to an analytic solution in a suitable function space for fundamental DEs. Finally, we perform numerical experiments to validate the robustness of our simple DNN architecture on the 1D transport equation, the 2D heat equation, the 2D wave equation, and the Lotka-Volterra system.