Stephen Baek

LG
h-index21
20papers
668citations
Novelty51%
AI Score50

20 Papers

LGApr 22, 2022
Federated Learning Enables Big Data for Rare Cancer Boundary Detection

Sarthak Pati, Ujjwal Baid, Brandon Edwards et al.

Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even not feasible) due to various limitations. Federated ML (FL) provides an alternative to train accurate and generalizable ML models, by only sharing numerical model updates. Here we present findings from the largest FL study to-date, involving data from 71 healthcare institutions across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, utilizing the largest dataset of such patients ever used in the literature (25,256 MRI scans from 6,314 patients). We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent. We anticipate our study to: 1) enable more studies in healthcare informed by large and diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further quantitative analyses for glioblastoma via performance optimization of our consensus model for eventual public release, and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.

LGMar 22, 2023
Challenges and opportunities for machine learning in multiscale computational modeling

Phong C. H. Nguyen, Joseph B. Choi, H. S. Udaykumar et al.

Many mechanical engineering applications call for multiscale computational modeling and simulation. However, solving for complex multiscale systems remains computationally onerous due to the high dimensionality of the solution space. Recently, machine learning (ML) has emerged as a promising solution that can either serve as a surrogate for, accelerate or augment traditional numerical methods. Pioneering work has demonstrated that ML provides solutions to governing systems of equations with comparable accuracy to those obtained using direct numerical methods, but with significantly faster computational speed. These high-speed, high-fidelity estimations can facilitate the solving of complex multiscale systems by providing a better initial solution to traditional solvers. This paper provides a perspective on the opportunities and challenges of using ML for complex multiscale modeling and simulation. We first outline the current state-of-the-art ML approaches for simulating multiscale systems and highlight some of the landmark developments. Next, we discuss current challenges for ML in multiscale computational modeling, such as the data and discretization dependence, interpretability, and data sharing and collaborative platform development. Finally, we suggest several potential research directions for the future.

LGJul 21, 2022
Federated Learning on Adaptively Weighted Nodes by Bilevel Optimization

Yankun Huang, Qihang Lin, Nick Street et al.

We propose a federated learning method with weighted nodes in which the weights can be modified to optimize the model's performance on a separate validation set. The problem is formulated as a bilevel optimization where the inner problem is a federated learning problem with weighted nodes and the outer problem focuses on optimizing the weights based on the validation performance of the model returned from the inner problem. A communication-efficient federated optimization algorithm is designed to solve this bilevel optimization problem. Under an error-bound assumption, we analyze the generalization performance of the output model and identify scenarios when our method is in theory superior to training a model only locally and to federated learning with static and evenly distributed weights.

MTRL-SCINov 15, 2022
Artificial intelligence approaches for materials-by-design of energetic materials: state-of-the-art, challenges, and future directions

Joseph B. Choi, Phong C. H. Nguyen, Oishik Sen et al.

Artificial intelligence (AI) is rapidly emerging as an enabling tool for solving various complex materials design problems. This paper aims to review recent advances in AI-driven materials-by-design and their applications to energetic materials (EM). Trained with data from numerical simulations and/or physical experiments, AI models can assimilate trends and patterns within the design parameter space, identify optimal material designs (micro-morphologies, combinations of materials in composites, etc.), and point to designs with superior/targeted property and performance metrics. We review approaches focusing on such capabilities with respect to the three main stages of materials-by-design, namely representation learning of microstructure morphology (i.e., shape descriptors), structure-property-performance (S-P-P) linkage estimation, and optimization/design exploration. We provide a perspective view of these methods in terms of their potential, practicality, and efficacy towards the realization of materials-by-design. Specifically, methods in the literature are evaluated in terms of their capacity to learn from a small/limited number of data, computational complexity, generalizability/scalability to other material species and operating conditions, interpretability of the model predictions, and the burden of supervision/data annotation. Finally, we suggest a few promising future research directions for EM materials-by-design, such as meta-learning, active learning, Bayesian learning, and semi-/weakly-supervised learning, to bridge the gap between machine learning research and EM research.

MTRL-SCINov 8, 2022
A physics-aware deep learning model for energy localization in multiscale shock-to-detonation simulations of heterogeneous energetic materials

Phong C. H. Nguyen, Yen-Thi Nguyen, Pradeep K. Seshadri et al.

Predictive simulations of the shock-to-detonation transition (SDT) in heterogeneous energetic materials (EM) are vital to the design and control of their energy release and sensitivity. Due to the complexity of the thermo-mechanics of EM during the SDT, both macro-scale response and sub-grid mesoscale energy localization must be captured accurately. This work proposes an efficient and accurate multiscale framework for SDT simulations of EM. We introduce a new approach for SDT simulation by using deep learning to model the mesoscale energy localization of shock-initiated EM microstructures. The proposed multiscale modeling framework is divided into two stages. First, a physics-aware recurrent convolutional neural network (PARC) is used to model the mesoscale energy localization of shock-initiated heterogeneous EM microstructures. PARC is trained using direct numerical simulations (DNS) of hotspot ignition and growth within microstructures of pressed HMX material subjected to different input shock strengths. After training, PARC is employed to supply hotspot ignition and growth rates for macroscale SDT simulations. We show that PARC can play the role of a surrogate model in a multiscale simulation framework, while drastically reducing the computation cost and providing improved representations of the sub-grid physics. The proposed multiscale modeling approach will provide a new tool for material scientists in designing high-performance and safer energetic materials.

MTRL-SCIApr 4, 2022
PARC: Physics-Aware Recurrent Convolutional Neural Networks to Assimilate Meso-scale Reactive Mechanics of Energetic Materials

Phong C. H. Nguyen, Yen-Thi Nguyen, Joseph B. Choi et al.

The thermo-mechanical response of shock-initiated energetic materials (EM) is highly influenced by their microstructures, presenting an opportunity to engineer EM microstructure in a "materials-by-design" framework. However, the current design practice is limited, as a large ensemble of simulations is required to construct the complex EM structure-property-performance linkages. We present the Physics-Aware Recurrent Convolutional (PARC) Neural Network, a deep-learning algorithm capable of learning the mesoscale thermo-mechanics of EM from a modest number of high-resolution direct numerical simulations (DNS). Validation results demonstrated that PARC could predict the themo-mechanical response of shocked EM with a comparable accuracy to DNS but with notably less computation time. The physics awareness of PARC enhances its modeling capabilities and generalizability, especially when challenged in unseen prediction scenarios. We also demonstrate that visualizing the artificial neurons at PARC can shed light on important aspects of EM thermos-mechanics and provide an additional lens for conceptualizing EM.

FLU-DYNDec 4, 2025
Multi-resolution Physics-Aware Recurrent Convolutional Neural Network for Complex Flows

Xinlun Cheng, Joseph Choi, H. S. Udaykumar et al.

We present MRPARCv2, Multi-resolution Physics-Aware Recurrent Convolutional Neural Network, designed to model complex flows by embedding the structure of advection-diffusion-reaction equations and leveraging a multi-resolution architecture. MRPARCv2 introduces hierarchical discretization and cross-resolution feature communication to improve the accuracy and efficiency of flow simulations. We evaluate the model on a challenging 2D turbulent radiative layer dataset from The Well multi-physics benchmark repository and demonstrate significant improvements when compared to the single resolution baseline model, in both Variance Scaled Root Mean Squared Error and physics-driven metrics, including turbulent kinetic energy spectra and mass-temperature distributions. Despite having 30% fewer trainable parameters, MRPARCv2 outperforms its predecessor by up to 50% in roll-out prediction error and 86% in spectral error. A preliminary study on uncertainty quantification was performed, and we also analyzed the model's performance under different levels of abstractions of the flow, specifically on sampling subsets of field variables. We find that the absence of physical constraints on the equation of state (EOS) in the network architecture leads to degraded accuracy. A variable substitution experiment confirms that this issue persists regardless of which physical quantity is predicted directly. Our findings highlight the advantages of multi-resolution inductive bias for capturing multi-scale flow dynamics and suggest the need for future PIML models to embed EOS knowledge to enhance physical fidelity.

LGFeb 5, 2024
Constrained Synthesis with Projected Diffusion Models

Jacob K Christopher, Stephen Baek, Ferdinando Fioretto

This paper introduces an approach to endow generative diffusion processes the ability to satisfy and certify compliance with constraints and physical principles. The proposed method recast the traditional sampling process of generative diffusion models as a constrained optimization problem, steering the generated data distribution to remain within a specified region to ensure adherence to the given constraints. These capabilities are validated on applications featuring both convex and challenging, non-convex, constraints as well as ordinary differential equations, in domains spanning from synthesizing new materials with precise morphometric properties, generating physics-informed motion, optimizing paths in planning scenarios, and human motion synthesis.

LGFeb 19, 2024
PARCv2: Physics-aware Recurrent Convolutional Neural Networks for Spatiotemporal Dynamics Modeling

Phong C. H. Nguyen, Xinlun Cheng, Shahab Azarfar et al.

Modeling unsteady, fast transient, and advection-dominated physics problems is a pressing challenge for physics-aware deep learning (PADL). The physics of complex systems is governed by large systems of partial differential equations (PDEs) and ancillary constitutive models with nonlinear structures, as well as evolving state fields exhibiting sharp gradients and rapidly deforming material interfaces. Here, we investigate an inductive bias approach that is versatile and generalizable to model generic nonlinear field evolution problems. Our study focuses on the recent physics-aware recurrent convolutions (PARC), which incorporates a differentiator-integrator architecture that inductively models the spatiotemporal dynamics of generic physical systems. We extend the capabilities of PARC to simulate unsteady, transient, and advection-dominant systems. The extended model, referred to as PARCv2, is equipped with differential operators to model advection-reaction-diffusion equations, as well as a hybrid integral solver for stable, long-time predictions. PARCv2 is tested on both standard benchmark problems in fluid dynamics, namely Burgers and Navier-Stokes equations, and then applied to more complex shock-induced reaction problems in energetic materials. We evaluate the behavior of PARCv2 in comparison to other physics-informed and learning bias models and demonstrate its potential to model unsteady and advection-dominant dynamics regimes.

LGSep 17, 2025
Towards a Physics Foundation Model

Florian Wiesner, Matthias Wessling, Stephen Baek

Foundation models have revolutionized natural language processing through a ``train once, deploy anywhere'' paradigm, where a single pre-trained model adapts to countless downstream tasks without retraining. Access to a Physics Foundation Model (PFM) would be transformative -- democratizing access to high-fidelity simulations, accelerating scientific discovery, and eliminating the need for specialized solver development. Yet current physics-aware machine learning approaches remain fundamentally limited to single, narrow domains and require retraining for each new system. We present the General Physics Transformer (GPhyT), trained on 1.8 TB of diverse simulation data, that demonstrates foundation model capabilities are achievable for physics. Our key insight is that transformers can learn to infer governing dynamics from context, enabling a single model to simulate fluid-solid interactions, shock waves, thermal convection, and multi-phase dynamics without being told the underlying equations. GPhyT achieves three critical breakthroughs: (1) superior performance across multiple physics domains, outperforming specialized architectures by up to 29x, (2) zero-shot generalization to entirely unseen physical systems through in-context learning, and (3) stable long-term predictions through 50-timestep rollouts. By establishing that a single model can learn generalizable physical principles from data alone, this work opens the path toward a universal PFM that could transform computational science and engineering.

LGOct 8, 2025
A physics-aware deep learning model for shear band formation around collapsing pores in shocked reactive materials

Xinlun Cheng, Bingzhe Chen, Joseph Choi et al.

Modeling shock-to-detonation phenomena in energetic materials (EMs) requires capturing complex physical processes such as strong shocks, rapid changes in microstructural morphology, and nonlinear dynamics of chemical reaction fronts. These processes participate in energy localization at hotspots, which initiate chemical energy release leading to detonation. This study addresses the formation of hotspots in crystalline EMs subjected to weak-to-moderate shock loading, which, despite its critical relevance to the safe storage and handling of EMs, remains underexplored compared to the well-studied strong shock conditions. To overcome the computational challenges associated with direct numerical simulations, we advance the Physics-Aware Recurrent Convolutional Neural Network (PARCv2), which has been shown to be capable of predicting strong shock responses in EMs. We improved the architecture of PARCv2 to rapidly predict shear localizations and plastic heating, which play important roles in the weak-to-moderate shock regime. PARCv2 is benchmarked against two widely used physics-informed models, namely, Fourier neural operator and neural ordinary differential equation; we demonstrate its superior performance in capturing the spatiotemporal dynamics of shear band formation. While all models exhibit certain failure modes, our findings underscore the importance of domain-specific considerations in developing robust AI-accelerated simulation tools for reactive materials.

COMP-PHJun 15, 2025
Latent Representation Learning of Multi-scale Thermophysics: Application to Dynamics in Shocked Porous Energetic Material

Shahab Azarfar, Joseph B. Choi, Phong CH. Nguyen et al.

Coupling of physics across length and time scales plays an important role in the response of microstructured materials to external loads. In a multi-scale framework, unresolved (subgrid) meso-scale dynamics is upscaled to the homogenized (macro-scale) representation of the heterogeneous material through closure models. Deep learning models trained using meso-scale simulation data are now a popular route to assimilate such closure laws. However, meso-scale simulations are computationally taxing, posing practical challenges in training deep learning-based surrogate models from scratch. In this work, we investigate an alternative meta-learning approach motivated by the idea of tokenization in natural language processing. We show that one can learn a reduced representation of the micro-scale physics to accelerate the meso-scale learning process by tokenizing the meso-scale evolution of the physical fields involved in an archetypal, albeit complex, reactive dynamics problem, \textit{viz.}, shock-induced energy localization in a porous energetic material. A probabilistic latent representation of \textit{micro}-scale dynamics is learned as building blocks for \textit{meso}-scale dynamics. The \textit{meso-}scale latent dynamics model learns the correlation between neighboring building blocks by training over a small dataset of meso-scale simulations. We compare the performance of our model with a physics-aware recurrent convolutional neural network (PARC) trained only on the full meso-scale dataset. We demonstrate that our model can outperform PARC with scarce meso-scale data. The proposed approach accelerates the development of closure models by leveraging inexpensive micro-scale simulations and fast training over a small meso-scale dataset, and can be applied to a range of multi-scale modeling problems.

LGOct 17, 2024
FedPAE: Peer-Adaptive Ensemble Learning for Asynchronous and Model-Heterogeneous Federated Learning

Brianna Mueller, W. Nick Street, Stephen Baek et al.

Federated learning (FL) enables multiple clients with distributed data sources to collaboratively train a shared model without compromising data privacy. However, existing FL paradigms face challenges due to heterogeneity in client data distributions and system capabilities. Personalized federated learning (pFL) has been proposed to mitigate these problems, but often requires a shared model architecture and a central entity for parameter aggregation, resulting in scalability and communication issues. More recently, model-heterogeneous FL has gained attention due to its ability to support diverse client models, but existing methods are limited by their dependence on a centralized framework, synchronized training, and publicly available datasets. To address these limitations, we introduce Federated Peer-Adaptive Ensemble Learning (FedPAE), a fully decentralized pFL algorithm that supports model heterogeneity and asynchronous learning. Our approach utilizes a peer-to-peer model sharing mechanism and ensemble selection to achieve a more refined balance between local and global information. Experimental results show that FedPAE outperforms existing state-of-the-art pFL algorithms, effectively managing diverse client capabilities and demonstrating robustness against statistical heterogeneity.

CVMar 3, 2021
On the role of depth predictions for 3D human pose estimation

Alec Diaz-Arias, Mitchell Messmore, Dmitriy Shin et al.

Following the successful application of deep convolutional neural networks to 2d human pose estimation, the next logical problem to solve is 3d human pose estimation from monocular images. While previous solutions have shown some success, they do not fully utilize the depth information from the 2d inputs. With the goal of addressing this depth ambiguity, we build a system that takes 2d joint locations as input along with their estimated depth value and predicts their 3d positions in camera coordinates. Given the inherent noise and inaccuracy from estimating depth maps from monocular images, we perform an extensive statistical analysis showing that given this noise there is still a statistically significant correlation between the predicted depth values and the third coordinate of camera coordinates. We further explain how the state-of-the-art results we achieve on the H3.6M validation set are due to the additional input of depth. Notably, our results are produced on neural network that accepts a low dimensional input and be integrated into a real-time system. Furthermore, our system can be combined with an off-the-shelf 2d pose detector and a depth map predictor to perform 3d pose estimation in the wild.

LGDec 18, 2019
Incremental ELMVIS for unsupervised learning

Anton Akusok, Emil Eirola, Yoan Miche et al.

An incremental version of the ELMVIS+ method is proposed in this paper. It iteratively selects a few best fitting data samples from a large pool, and adds them to the model. The method keeps high speed of ELMVIS+ while allowing for much larger possible sample pools due to lower memory requirements. The extension is useful for reaching a better local optimum with greedy optimization of ELMVIS, and the data structure can be specified in semi-supervised optimization. The major new application of incremental ELMVIS is not to visualization, but to a general dataset processing. The method is capable of learning dependencies from non-organized unsupervised data -- either reconstructing a shuffled dataset, or learning dependencies in complex high-dimensional space. The results are interesting and promising, although there is space for improvements.

SDJun 16, 2019
Multi-scale Embedded CNN for Music Tagging (MsE-CNN)

Nima Hamidi, Mohsen Vahidzadeh, Stephen Baek

Convolutional neural networks (CNN) recently gained notable attraction in a variety of machine learning tasks: including music classification and style tagging. In this work, we propose implementing intermediate connections to the CNN architecture to facilitate the transfer of multi-scale/level knowledge between different layers. Our novel model for music tagging shows significant improvement in comparison to the proposed approaches in the literature, due to its ability to carry low-level timbral features to the last layer.

IVMar 26, 2019
Deep segmentation networks predict survival of non-small cell lung cancer

Stephen Baek, Yusen He, Bryan G. Allen et al.

Non-small-cell lung cancer (NSCLC) represents approximately 80-85% of lung cancer diagnoses and is the leading cause of cancer-related death worldwide. Recent studies indicate that image-based radiomics features from positron emission tomography-computed tomography (PET/CT) images have predictive power on NSCLC outcomes. To this end, easily calculated functional features such as the maximum and the mean of standard uptake value (SUV) and total lesion glycolysis (TLG) are most commonly used for NSCLC prognostication, but their prognostic value remains controversial. Meanwhile, convolutional neural networks (CNN) are rapidly emerging as a new premise for cancer image analysis, with significantly enhanced predictive power compared to other hand-crafted radiomics features. Here we show that CNN trained to perform the tumor segmentation task, with no other information than physician contours, identify a rich set of survival-related image features with remarkable prognostic value. In a retrospective study on 96 NSCLC patients before stereotactic-body radiotherapy (SBRT), we found that the CNN segmentation algorithm (U-Net) trained for tumor segmentation in PET/CT images, contained features having strong correlation with 2- and 5-year overall and disease-specific survivals. The U-net algorithm has not seen any other clinical information (e.g. survival, age, smoking history) than the images and the corresponding tumor contours provided by physicians. Furthermore, through visualization of the U-Net, we also found convincing evidence that the regions of progression appear to match with the regions where the U-Net features identified patterns that predicted higher likelihood of death. We anticipate our findings will be a starting point for more sophisticated non-intrusive patient specific cancer prognosis determination.

CVDec 3, 2018
ZerNet: Convolutional Neural Networks on Arbitrary Surfaces via Zernike Local Tangent Space Estimation

Zhiyu Sun, Ethan Rooke, Jerome Charton et al.

In this paper, we propose a novel formulation to extend CNNs to two-dimensional (2D) manifolds using orthogonal basis functions, called Zernike polynomials. In many areas, geometric features play a key role in understanding scientific phenomena. Thus, an ability to codify geometric features into a mathematical quantity can be critical. Recently, convolutional neural networks (CNNs) have demonstrated the promising capability of extracting and codifying features from visual information. However, the progress has been concentrated in computer vision applications where there exists an inherent grid-like structure. In contrast, many geometry processing problems are defined on curved surfaces, and the generalization of CNNs is not quite trivial. The difficulties are rooted in the lack of key ingredients such as the canonical grid-like representation, the notion of consistent orientation, and a compatible local topology across the domain. In this paper, we prove that the convolution of two functions can be represented as a simple dot product between Zernike polynomial coefficients; and the rotation of a convolution kernel is essentially a set of 2-by-2 rotation matrices applied to the coefficients. As such, the key contribution of this work resides in a concise but rigorous mathematical generalization of the CNN building blocks.

LGJun 20, 2018
Wall Stress Estimation of Cerebral Aneurysm based on Zernike Convolutional Neural Networks

Zhiyu Sun, Jia Lu, Stephen Baek

Convolutional neural networks (ConvNets) have demonstrated an exceptional capacity to discern visual patterns from digital images and signals. Unfortunately, such powerful ConvNets do not generalize well to arbitrary-shaped manifolds, where data representation does not fit into a tensor-like grid. Hence, many fields of science and engineering, where data points possess some manifold structure, cannot enjoy the full benefits of the recent advances in ConvNets. The aneurysm wall stress estimation problem introduced in this paper is one of many such problems. The problem is well-known to be of a paramount clinical importance, but yet, traditional ConvNets cannot be applied due to the manifold structure of the data, neither does the state-of-the-art geometric ConvNets perform well. Motivated by this, we propose a new geometric ConvNet method named ZerNet, which builds upon our novel mathematical generalization of convolution and pooling operations on manifolds. Our study shows that the ZerNet outperforms the other state-of-the-art geometric ConvNets in terms of accuracy.

GROct 17, 2017
Embedded Spectral Descriptors: Learning the point-wise correspondence metric via Siamese neural networks

Zhiyu Sun, Yusen He, Andrey Gritsenko et al.

A robust and informative local shape descriptor plays an important role in mesh registration. In this regard, spectral descriptors that are based on the spectrum of the Laplace-Beltrami operator have been a popular subject of research for the last decade due to their advantageous properties, such as isometry invariance. Despite such, however, spectral descriptors often fail to give a correct similarity measure for non-isometric cases where the metric distortion between the models is large. Hence, they are not reliable for correspondence matching problems when the models are not isometric. In this paper, it is proposed a method to improve the similarity metric of spectral descriptors for correspondence matching problems. We embed a spectral shape descriptor into a different metric space where the Euclidean distance between the elements directly indicates the geometric dissimilarity. We design and train a Siamese neural network to find such an embedding, where the embedded descriptors are promoted to rearrange based on the geometric similarity. We demonstrate our approach can significantly enhance the performance of the conventional spectral descriptors by the simple augmentation achieved via the Siamese neural network in comparison to other state-of-the-art methods.