Kazuma Kobayashi

LG
h-index: 47
Papers: 27
Citations: 492
Novelty: 47%
AI Score: 54

27 Papers

CV Jul 22, 2023 Code
Expert Knowledge-Aware Image Difference Graph Representation Learning for Difference-Aware Medical Visual Question Answering

Xinyue Hu, Lin Gu, Qiyuan An et al.

To contribute to automating the medical vision-language model, we propose a novel Chest-Xray Difference Visual Question Answering (VQA) task. Given a pair of main and reference images, this task attempts to answer several questions on both diseases and, more importantly, the differences between them. This is consistent with the radiologist's diagnosis practice that compares the current image with the reference before concluding the report. We collect a new dataset, namely MIMIC-Diff-VQA, including 700,703 QA pairs from 164,324 pairs of main and reference images. Compared to existing medical VQA datasets, our questions are tailored to the Assessment-Diagnosis-Intervention-Evaluation treatment procedure used by clinical professionals. Meanwhile, we also propose a novel expert knowledge-aware graph representation learning model to address this task. The proposed baseline model leverages expert knowledge such as anatomical structure prior, semantic, and spatial knowledge to construct a multi-relationship graph, representing the image differences between two images for the image difference VQA task. The dataset and code can be found at https://github.com/Holipori/MIMIC-Diff-VQA. We believe this work would further push forward the medical vision language model.

CV Feb 19, 2023
Interpretable Medical Image Visual Question Answering via Multi-Modal Relationship Graph Learning

Xinyue Hu, Lin Gu, Kazuma Kobayashi et al.

Medical visual question answering (VQA) aims to answer clinically relevant questions regarding input medical images. This technique has the potential to improve the efficiency of medical professionals while relieving the burden on the public health system, particularly in resource-poor countries. Existing medical VQA methods tend to encode medical images and learn the correspondence between visual features and questions without exploiting the spatial, semantic, or medical knowledge behind them. This is partially because of the small size of the current medical VQA dataset, which often includes simple questions. Therefore, we first collected a comprehensive and large-scale medical VQA dataset, focusing on chest X-ray images. The questions involved detailed relationships, such as disease names, locations, levels, and types in our dataset. Based on this dataset, we also propose a novel baseline method by constructing three different relationship graphs: spatial relationship, semantic relationship, and implicit relationship graphs on the image regions, questions, and semantic labels. The answer and graph reasoning paths are learned for different questions.

CV Mar 7, 2023
Sketch-based Medical Image Retrieval

Kazuma Kobayashi, Lin Gu, Ryuichiro Hataya et al.

The amount of medical images stored in hospitals is increasing faster than ever; however, utilizing the accumulated medical images has been limited. This is because existing content-based medical image retrieval (CBMIR) systems usually require example images to construct query vectors; nevertheless, example images cannot always be prepared. Besides, there can be images with rare characteristics that make it difficult to find similar example images, which we call isolated samples. Here, we introduce a novel sketch-based medical image retrieval (SBMIR) system that enables users to find images of interest without example images. The key idea lies in feature decomposition of medical images, whereby the entire feature of a medical image can be decomposed into and reconstructed from normal and abnormal features. By extending this idea, our SBMIR system provides an easy-to-use two-step graphical user interface: users first select a template image to specify a normal feature and then draw a semantic sketch of the disease on the template image to represent an abnormal feature. Subsequently, it integrates the two kinds of input to construct a query vector and retrieves reference images with the closest reference vectors. Using two datasets, ten healthcare professionals with various clinical backgrounds participated in the user test for evaluation. As a result, our SBMIR system enabled users to overcome previous challenges, including image retrieval based on fine-grained image characteristics, image retrieval without example images, and image retrieval for isolated samples. Our SBMIR system achieves flexible medical image retrieval on demand, thereby expanding the utility of medical image databases.
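The retrieval step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the normal and abnormal features are plain vectors combined by element-wise addition (the abstract does not state the integration rule) and that "closest" means smallest Euclidean distance. The names `build_query` and `retrieve` are hypothetical.

```python
import math

def build_query(normal_feat, abnormal_feat):
    # Assumption: integrate the two features by element-wise addition.
    return [n + a for n, a in zip(normal_feat, abnormal_feat)]

def retrieve(query, reference_vectors, k=1):
    # Rank reference images by Euclidean distance to the query vector.
    ranked = sorted(reference_vectors.items(),
                    key=lambda item: math.dist(query, item[1]))
    return [name for name, _ in ranked[:k]]

# Toy reference database: image name -> reference vector.
references = {
    "img_a": [1.0, 0.0, 2.0],
    "img_b": [0.0, 3.0, 0.0],
    "img_c": [1.0, 1.0, 2.0],
}
# Template image supplies the normal feature, the sketch the abnormal one.
query = build_query([1.0, 0.0, 1.0], [0.0, 1.0, 1.0])
top = retrieve(query, references, k=2)
```

In a real system the two feature vectors would come from trained encoders; the toy vectors here only show how the query is assembled and matched.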

AP Nov 23, 2022
Digital Twin-Centered Hybrid Data-Driven Multi-Stage Deep Learning Framework for Enhanced Nuclear Reactor Power Prediction

James Daniell, Kazuma Kobayashi, Ayodeji Alajo et al.

The accurate and efficient modeling of nuclear reactor transients is crucial for ensuring safe and optimal reactor operation. Traditional physics-based models, while valuable, can be computationally intensive and may not fully capture the complexities of real-world reactor behavior. This paper introduces a novel hybrid digital twin-focused multi-stage deep learning framework that addresses these limitations, offering a faster and more robust solution for predicting the final steady-state power of reactor transients. By leveraging a combination of feed-forward neural networks with both classification and regression stages, and training on a unique dataset that integrates real-world measurements of reactor power and controls state from the Missouri University of Science and Technology Reactor (MSTR) with noise-enhanced simulated data, our approach achieves remarkable accuracy (96% classification, 2.3% MAPE). The incorporation of simulated data with noise significantly improves the model's generalization capabilities, mitigating the risk of overfitting. Designed as a digital twin supporting system, this framework integrates real-time, synchronized predictions of reactor state transitions, enabling dynamic operational monitoring and optimization. This innovative solution not only enables rapid and precise prediction of reactor behavior but also has the potential to revolutionize nuclear reactor operations, facilitating enhanced safety protocols, optimized performance, and streamlined decision-making processes. By aligning data-driven insights with the principles of digital twins, this work lays the groundwork for adaptable and scalable solutions in nuclear system management.
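The classification-then-regression flow and the MAPE metric quoted in this abstract can be sketched as below. This is a hedged illustration only: the toy classifier, the per-class regressors, and the feature layout are hypothetical stand-ins for the paper's feed-forward networks.

```python
def mape(actual, predicted):
    # Mean absolute percentage error, in percent; actual values must be nonzero.
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def two_stage_predict(features, classify, regressors):
    # Stage 1: assign the transient to a class.
    label = classify(features)
    # Stage 2: run the regressor associated with that class.
    return label, regressors[label](features)

# Hypothetical features: (initial power, signed control-rod movement).
classify = lambda f: "rise" if f[1] > 0 else "fall"
regressors = {
    "rise": lambda f: f[0] + 10.0 * f[1],
    "fall": lambda f: f[0] + 5.0 * f[1],
}

label, final_power = two_stage_predict((100.0, 2.0), classify, regressors)
error_pct = mape([100.0, 200.0], [98.0, 204.0])
```

Splitting the task this way lets each regressor specialize on one kind of transient, at the cost of depending on the classifier being right.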

CV Sep 6, 2023
Expert Uncertainty and Severity Aware Chest X-Ray Classification by Multi-Relationship Graph Learning

Mengliang Zhang, Xinyue Hu, Lin Gu et al.

Patients undergoing chest X-rays (CXR) often have multiple lung diseases. Because of complex pathologies, subtle texture changes among different lung lesions in images, and differences in patient condition, radiologists may remain uncertain when evaluating a patient's condition, even after long-term clinical training and professional guidance. This uncertainty introduces considerable noise into disease labels extracted from CXR reports. In this paper, we re-extract disease labels from CXR reports to make them more realistic by considering disease severity and uncertainty in classification. Our contributions are as follows: 1. We re-extracted the disease labels with severity and uncertainty by a rule-based approach with keywords discussed with clinical experts. 2. To further improve the explainability of chest X-ray diagnosis, we designed a multi-relationship graph learning method with an expert uncertainty-aware loss function. 3. Our multi-relationship graph learning method can also interpret the disease classification results. Our experimental results show that models considering disease severity and uncertainty outperform previous state-of-the-art methods.
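A rule-based extractor of the kind mentioned in contribution 1 might look like the sketch below. The keyword lists, the negation rule, and the output format are all hypothetical; the paper's actual keywords were chosen with clinical experts and are not given in the abstract.

```python
import re

# Hypothetical keyword lists for illustration only.
UNCERTAIN = ("possible", "probable", "may represent", "cannot exclude")
SEVERITY = {"mild": 1, "moderate": 2, "severe": 3}

def extract_label(sentence, disease):
    text = sentence.lower()
    if disease not in text:
        return None  # disease not mentioned in this sentence
    if re.search(r"\bno\b", text):
        # Crude negation rule: any standalone "no" marks the finding absent.
        return {"present": False, "uncertain": False, "severity": 0}
    uncertain = any(phrase in text for phrase in UNCERTAIN)
    # Severity 0 means the report gave no severity keyword.
    severity = next((lvl for word, lvl in SEVERITY.items() if word in text), 0)
    return {"present": True, "uncertain": uncertain, "severity": severity}

label = extract_label("Probable mild pneumonia in the left lower lobe.", "pneumonia")
```

The point of the richer label is that "probable mild pneumonia" is no longer collapsed into the same positive label as a definite, severe finding.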

LG Jan 17, 2023
Explainable, Interpretable & Trustworthy AI for Intelligent Digital Twin: Case Study on Remaining Useful Life

Kazuma Kobayashi, Syed Bahauddin Alam

Artificial intelligence (AI) and machine learning (ML) are increasingly used in energy and engineering systems, but these models must be fair, unbiased, and explainable. It is critical to have confidence in AI's trustworthiness. ML techniques have been useful in predicting important parameters and in improving model performance. However, for these AI techniques to be useful for making decisions, they need to be audited, accounted for, and easy to understand. Therefore, the use of explainable AI (XAI) and interpretable machine learning (IML) is crucial for the accurate prediction of prognostics, such as remaining useful life (RUL), in a digital twin system, to make it intelligent while ensuring that the AI model is transparent in its decision-making processes and that the predictions it generates can be understood and trusted by users. By using AI that is explainable, interpretable, and trustworthy, intelligent digital twin systems can make more accurate predictions of RUL, leading to better maintenance and repair planning, and ultimately, improved system performance. The objective of this paper is to explain the ideas of XAI and IML and to justify the important role of AI/ML in the digital twin framework and components, which requires XAI to understand the prediction better. This paper explains the importance of XAI and IML in both local and global aspects to ensure the use of trustworthy AI/ML applications for RUL prediction. We used the RUL prediction for the XAI and IML studies and leveraged the integrated Python toolbox for interpretable machine learning (PiML).

ML Aug 15, 2023
Deep Neural Operator Driven Real Time Inference for Nuclear Systems to Enable Digital Twin Solutions

Kazuma Kobayashi, Syed Bahauddin Alam

This paper focuses on the feasibility of Deep Neural Operator (DeepONet) as a robust surrogate modeling method within the context of digital twin (DT) for nuclear energy systems. Through benchmarking and evaluation, this study showcases the generalizability and computational efficiency of DeepONet in solving a challenging particle transport problem. DeepONet also exhibits remarkable prediction accuracy and speed, outperforming traditional ML methods, making it a suitable algorithm for real-time DT inference. However, the application of DeepONet also reveals challenges related to optimal sensor placement and model evaluation, critical aspects of real-world implementation. Addressing these challenges will further enhance the method's practicality and reliability. Overall, DeepONet is a promising and transformative tool for nuclear engineering research and applications. Its accurate prediction and computational efficiency capabilities can revolutionize DT systems, advancing nuclear engineering research. This study marks an important step towards harnessing the power of surrogate modeling techniques in critical engineering domains.

LG Jan 17, 2023
Improved generalization with deep neural operators for engineering systems: Path towards digital twin

Kazuma Kobayashi, James Daniell, Syed Bahauddin Alam

Neural Operator Networks (ONets) represent a novel advancement in machine learning algorithms, offering a robust and generalizable alternative for approximating the solutions of partial differential equations (PDEs). Unlike traditional Neural Networks (NN), which directly approximate functions, ONets specialize in approximating mathematical operators, enhancing their efficacy in addressing complex PDEs. In this work, we evaluate the capabilities of Deep Operator Networks (DeepONets), an ONets implementation using a branch/trunk architecture. Three test cases are studied: a system of ODEs, a general diffusion system, and the convection/diffusion Burgers equation. It is demonstrated that DeepONets can accurately learn the solution operators, achieving prediction accuracy scores above 0.96 for the ODE and diffusion problems over the observed domain. More importantly, when evaluated on unseen scenarios in a zero-shot setting (without retraining), the trained models exhibit excellent generalization ability. This underscores ONets' vital niche for surrogate modeling and digital twin development across physical systems. While convection-diffusion poses a greater challenge, the results confirm the promise of ONets and motivate further enhancements to the DeepONet algorithm. This work represents an important step towards unlocking the potential of digital twins through robust and generalizable surrogates.
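The branch/trunk architecture named in this abstract can be illustrated with a toy forward pass. This is a sketch under stated assumptions: each sub-network is reduced to a single dense layer with fixed, untrained weights, and the helper names `dense` and `deeponet` are hypothetical. What it does show correctly is the defining structure, where the operator output at a query point is the dot product of a branch encoding of the input function and a trunk encoding of the query location.

```python
import math

def dense(weights, bias, x, activation=math.tanh):
    # One fully connected layer: activation(W x + b).
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def deeponet(u_sensors, y, branch, trunk):
    # Branch net encodes the input function sampled at fixed sensor points.
    b = dense(*branch, u_sensors)
    # Trunk net encodes the location y where the output is queried.
    t = dense(*trunk, [y])
    # Operator output G(u)(y) is the dot product of the two encodings.
    return sum(bk * tk for bk, tk in zip(b, t))

# Untrained toy weights as (weight matrix, bias vector) pairs.
branch = ([[0.5, -0.2, 0.1], [0.3, 0.3, 0.3]], [0.0, 0.1])
trunk = ([[1.0], [-1.0]], [0.0, 0.0])

u = [0.2, 0.4, 0.6]                      # input function at three sensors
value = deeponet(u, 0.5, branch, trunk)  # prediction at y = 0.5
```

Because the trunk takes the query location as an input, the same trained model can be evaluated at any `y` without retraining, which is what makes the approach attractive for surrogate modeling.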

AI Dec 29, 2025
Agentic Physical AI toward a Domain-Specific Foundation Model for Nuclear Reactor Control

Yoonpyo Lee, Kazuma Kobayashi, Sai Puppala et al.

The prevailing paradigm in AI for physical systems, scaling general-purpose foundation models toward universal multimodal reasoning, confronts a fundamental barrier at the control interface. Recent benchmarks show that even frontier vision-language models achieve only 50-53% accuracy on basic quantitative physics tasks, behaving as approximate guessers that preserve semantic plausibility while violating physical constraints. This input unfaithfulness is not a scaling deficiency but a structural limitation. Perception-centric architectures optimize parameter-space imitation, whereas safety-critical control demands outcome-space guarantees over executed actions. Here, we present a fundamentally different pathway toward domain-specific foundation models by introducing compact language models operating as Agentic Physical AI, in which policy optimization is driven by physics-based validation rather than perceptual inference. We train a 360-million-parameter model on synthetic reactor control scenarios, scaling the dataset from 10^3 to 10^5 examples. This induces a sharp phase transition absent in general-purpose models. Small-scale systems exhibit high-variance imitation with catastrophic tail risk, while large-scale models undergo variance collapse exceeding 500x reduction, stabilizing execution-level behavior. Despite balanced exposure to four actuation families, the model autonomously rejects approximately 70% of the training distribution and concentrates 95% of runtime execution on a single-bank strategy. Learned representations transfer across distinct physics and continuous input modalities without architectural modification.

LG Sep 30, 2022
Leveraging Industry 4.0 -- Deep Learning, Surrogate Model and Transfer Learning with Uncertainty Quantification Incorporated into Digital Twin for Nuclear System

M. Rahman, Abid Khan, Sayeed Anowar et al.

Industry 4.0 targets the conversion of traditional industries into intelligent ones through technological revolution. This revolution is only possible through innovation, optimization, interconnection, and rapid decision-making capability. Numerical models are believed to be the key components of Industry 4.0, facilitating quick decision-making through simulations instead of costly experiments. However, numerical investigation of precise, high-fidelity models for optimization or decision-making is usually time-consuming and computationally expensive. In such instances, data-driven surrogate models are excellent substitutes for fast computational analysis and the probabilistic prediction of the output parameter for new input parameters. The emergence of Internet of Things (IoT) and Machine Learning (ML) has made the concept of surrogate modeling even more viable. However, these surrogate models carry uncertainties that are intrinsic, originate from modeling defects, or both. These uncertainties, if not quantified and minimized, can produce a skewed result. Therefore, proper implementation of uncertainty quantification techniques is crucial during the analysis of optimization, cost reduction, or safety enhancement processes. This chapter begins with a brief overview of the concept of surrogate modeling, transfer learning, IoT and digital twins. After that, a detailed overview of uncertainties, uncertainty quantification frameworks, and specifics of uncertainty quantification methodologies for a surrogate model linked to a digital twin is presented. Finally, the use of uncertainty quantification approaches in the nuclear industry is addressed.

SY Sep 25, 2022
Machine Learning and Artificial Intelligence-Driven Multi-Scale Modeling for High Burnup Accident-Tolerant Fuels for Light Water-Based SMR Applications

Md. Shamim Hassan, Abid Hossain Khan, Richa Verma et al.

The concept of the small modular reactor has changed the outlook for tackling future energy crises. This new reactor technology is very promising considering its lower investment requirements, modularity, design simplicity, and enhanced safety features. The application of artificial intelligence-driven multi-scale modeling (neutronics, thermal hydraulics, fuel performance, etc.) incorporating Digital Twin and associated uncertainties in the research of small modular reactors is a recent concept. In this work, a comprehensive study is conducted on the multiscale modeling of accident-tolerant fuels. The application of these fuels in light water-based small modular reactors is explored. This chapter also focuses on the application of machine learning and artificial intelligence in the design optimization, control, and monitoring of small modular reactors. Finally, a brief assessment of the research gap on the application of artificial intelligence to the development of high burnup composite accident-tolerant fuels is provided. Necessary actions to fill these gaps are also discussed.

60.3 NE Apr 14
Gradient-Free Continual Learning in Spiking Neural Networks via Inter-Spike Interval Regularization

Samrendra Roy, Kazuma Kobayashi, Souvik Chakraborty et al.

Continual learning, the ability to acquire new tasks sequentially without forgetting prior knowledge, is essential for deploying neural networks in dynamic real-world environments, from nuclear digital twin monitoring to grid-edge fault detection. Existing synaptic importance methods, such as Elastic Weight Consolidation (EWC) and Synaptic Intelligence (SI), rely on gradient computation, making them incompatible with neuromorphic hardware that lacks backpropagation support. We propose ISI-CV, the first gradient-free synaptic importance metric for SNN continual learning, derived from the Coefficient of Variation (CV) of Inter-Spike Intervals (ISIs). Neurons that fire regularly (low CV) encode stable, task-relevant features and are protected from overwriting; neurons with irregular firing are permitted to adapt freely. ISI-CV requires only spike time counters and integer arithmetic, all of which are native to every neuromorphic chip. We evaluate on four benchmarks of increasing difficulty: Split-MNIST, Permuted-MNIST, Split-FashionMNIST, and Split-N-MNIST using real Dynamic Vision Sensor (DVS) event data. Across three seeds, ISI-CV achieves zero forgetting (AF = 0.000 +/- 0.000) on Split-MNIST and Split-FashionMNIST, near-zero forgetting on Permuted-MNIST (AF = 0.001 +/- 0.000), and the highest accuracy with the lowest forgetting on real neuromorphic DVS data (AA = 0.820 +/- 0.012, AF = 0.221 +/- 0.014). On N-MNIST, gradient-based methods produce unreliable importance estimates and perform worse than no regularization; ISI-CV avoids this failure by design.
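The quantity behind ISI-CV, the coefficient of variation of inter-spike intervals, can be computed as in the sketch below. It uses floating-point statistics for readability, whereas the abstract describes a counter and integer-arithmetic formulation; how the CV is turned into a synaptic protection strength is not stated in the abstract and is not modeled here. The function name `isi_cv` is hypothetical, and spike times are assumed to be strictly increasing.

```python
import statistics

def isi_cv(spike_times):
    # Inter-spike intervals between consecutive spikes.
    isis = [t1 - t0 for t0, t1 in zip(spike_times, spike_times[1:])]
    if len(isis) < 2:
        return None  # too few spikes to estimate variability
    # Coefficient of variation: standard deviation divided by mean.
    return statistics.pstdev(isis) / statistics.fmean(isis)

regular = [0, 10, 20, 30, 40]    # perfectly periodic firing
irregular = [0, 2, 20, 23, 40]   # bursty, irregular firing

cv_regular = isi_cv(regular)
cv_irregular = isi_cv(irregular)
```

Under the rule described in the abstract, the regularly firing neuron (low CV) would be treated as encoding stable features and protected, while the irregular one would remain free to adapt.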

49.1 LG Apr 17
Neuroscience Inspired Graph Operators Towards Edge-Deployable Virtual Sensing for Irregular Geometries

William Howes, Farid Ahmed, Kazuma Kobayashi et al.

Predicting full-field physics through the real-time virtual sensing of engineering systems can enhance limited physical sensors, but it often requires handling sparse-to-dense reconstruction, complex multiphysics, and highly irregular geometries under strict latency and energy constraints for edge deployability. Neural operators have been presented as a potential candidate for such applications, but few architectures explicitly address power consumption. Spiking neurons can provide a potential solution when deployed on neuromorphic hardware, but existing neuron models suffer severe performance degradation on regression-based virtual sensing. To address the performance concerns and edge constraints, we present the Variable Spiking Graph Neural Operator (VS-GNO), which integrates a sophisticated spectral-spatial convolutional analysis with a previously developed Variable Spiking Neuron (VSN) and an energy-error balance loss function. With a non-spiking $L_2$ error baseline of $0.4\%$, VS-GNO can provide a reconstruction error of $0.71\%$ with $15\%$ average spiking in its spectral-only form and $1.04\%$ with $24.5\%$ spiking in its full form. These results position VS-GNO as a promising step towards energy-efficient, edge-deployable neural operators for real-time sparse-to-dense virtual sensing in complex, highly irregular engineering environments.

47.2 LG Mar 23
Adversarial Vulnerabilities in Neural Operator Digital Twins: Gradient-Free Attacks on Nuclear Thermal-Hydraulic Surrogates

Samrendra Roy, Kazuma Kobayashi, Souvik Chakraborty et al.

Operator learning models are rapidly emerging as the predictive core of digital twins for nuclear and energy systems, promising real-time field reconstruction from sparse sensor measurements. Yet their robustness to adversarial perturbations remains uncharacterized, a critical gap for deployment in safety-critical systems. Here we show that neural operators are acutely vulnerable to extremely sparse (fewer than 1% of inputs), physically plausible perturbations that exploit their sensitivity to boundary conditions. Using gradient-free differential evolution across four operator architectures, we demonstrate that minimal modifications trigger catastrophic prediction failures, increasing relative $L_2$ error from $\sim$1.5% (validated accuracy) to 37-63% while remaining completely undetectable by standard validation metrics. Notably, 100% of successful single-point attacks pass z-score anomaly detection. We introduce the effective perturbation dimension $d_{\text{eff}}$, a Jacobian-based diagnostic that, together with sensitivity magnitude, yields a two-factor vulnerability model explaining why architectures with extreme sensitivity concentration (POD-DeepONet, $d_{\text{eff}} \approx 1$) are not necessarily the most exploitable, since low-rank output projections cap maximum error, while moderate concentration with sufficient amplification (S-DeepONet, $d_{\text{eff}} \approx 4$) produces the highest attack success. Gradient-free search outperforms gradient-based alternatives (PGD) on architectures with gradient pathologies, while random perturbations of equal magnitude achieve near-zero success rates, confirming that the discovered vulnerabilities are structural. Our findings expose a previously overlooked attack surface in operator learning models and establish that these models require robustness guarantees beyond standard validation before deployment.
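The observation that single-point attacks pass z-score anomaly detection can be illustrated with a simple detector. This is a sketch, not the paper's detection setup: the threshold of 3, the sensor history, and the function name `zscore_flags` are assumptions made for illustration.

```python
import statistics

def zscore_flags(history, reading, threshold=3.0):
    # Flag a reading whose z-score against past values exceeds the threshold.
    mu = statistics.fmean(history)
    sigma = statistics.stdev(history)
    return abs(reading - mu) / sigma > threshold

# Hypothetical boundary-condition sensor history (e.g. inlet temperature).
history = [300.0, 302.0, 298.0, 301.0, 299.0]

subtle = zscore_flags(history, 304.0)   # stays inside the normal spread
blatant = zscore_flags(history, 306.0)  # falls outside the threshold
```

A perturbation only needs to stay inside the detector's tolerance band to go unnoticed; whether it still damages the prediction depends on how sensitive the operator is to that input, which is the vulnerability the abstract describes.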

80.1 LG Apr 2
Graph Neural Operator Towards Edge Deployability and Portability for Sparse-to-Dense, Real-Time Virtual Sensing on Irregular Grids

William Howes, Jason Yoo, Kazuma Kobayashi et al.

Accurate sensing of spatially distributed physical fields typically requires dense instrumentation, which is often infeasible in real-world systems due to cost, accessibility, and environmental constraints. Physics-based solvers address this through direct numerical integration of governing equations, but their computational latency and power requirements preclude real-time use in resource-constrained monitoring and control systems. Here we introduce VIRSO (Virtual Irregular Real-Time Sparse Operator), a graph-based neural operator for sparse-to-dense reconstruction on irregular geometries, and a variable-connectivity algorithm, Variable KNN (V-KNN), for mesh-informed graph construction. Unlike prior neural operators that treat hardware deployability as secondary, VIRSO reframes inference as measurement: the combination of both spectral and spatial analysis provides accurate reconstruction without the high latency and power consumption of previous graph-based methodologies with poor scalability, presenting VIRSO as a potential candidate for edge-constrained, real-time virtual sensing. We evaluate VIRSO on three nuclear thermal-hydraulic benchmarks of increasing geometric and multiphysics complexity, across reconstruction ratios from 47:1 to 156:1. VIRSO achieves mean relative $L_2$ errors below 1%, outperforming other benchmark operators while using fewer parameters. The full 10-layer configuration reduces the energy-delay product (EDP) from ${\approx}206$ J$\cdot$ms for the graph operator baseline to $10.1$ J$\cdot$ms on an NVIDIA H200. Implemented on an NVIDIA Jetson Orin Nano, all configurations of VIRSO provide sub-10 W power consumption and sub-second latency. These results establish the edge-feasibility and hardware-portability of VIRSO and present compute-aware operator learning as a new paradigm for real-time sensing in inaccessible and resource-constrained environments.
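The two metrics quoted in this abstract, relative $L_2$ error and energy-delay product (EDP), can be computed as in the sketch below. The formulas are the standard definitions; the sample field values, the 5 W power draw, and the 100 ms latency are made-up inputs, and only the 206 and 10.1 J·ms figures come from the abstract.

```python
import math

def relative_l2(pred, true):
    # ||pred - true||_2 / ||true||_2
    num = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)))
    den = math.sqrt(sum(t ** 2 for t in true))
    return num / den

def energy_delay_product(power_w, latency_ms):
    # Energy (J) = power (W) * latency (s); EDP = energy * latency, in J*ms.
    energy_j = power_w * latency_ms / 1000.0
    return energy_j * latency_ms

err = relative_l2([1.01, 2.0, 2.0], [1.0, 2.0, 2.0])  # about 0.33%
edp = energy_delay_product(5.0, 100.0)                 # hypothetical device

# Reduction factor implied by the reported EDP values (about 20x).
edp_ratio = 206.0 / 10.1
```

EDP penalizes latency twice, once through energy and once directly, so a model that is both faster and lower power improves it more than either gain alone would suggest.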

CL Jan 13
Med-CoReasoner: Reducing Language Disparities in Medical Reasoning via Language-Informed Co-Reasoning

Fan Gao, Sherry T. Tong, Jiwoong Sohn et al.

While reasoning-enhanced large language models perform strongly on English medical tasks, a persistent multilingual gap remains, with substantially weaker reasoning in local languages, limiting equitable global medical deployment. To bridge this gap, we introduce Med-CoReasoner, a language-informed co-reasoning framework that elicits parallel English and local-language reasoning, abstracts them into structured concepts, and integrates local clinical knowledge into an English logical scaffold via concept-level alignment and retrieval. This design combines the structural robustness of English reasoning with the practice-grounded expertise encoded in local languages. To evaluate multilingual medical reasoning beyond multiple-choice settings, we construct MultiMed-X, a benchmark covering seven languages with expert-annotated long-form question answering and natural language inference tasks, comprising 350 instances per language. Experiments across three benchmarks show that Med-CoReasoner improves multilingual reasoning performance by an average of 5%, with particularly substantial gains in low-resource languages. Moreover, model distillation and expert evaluation analysis further confirm that Med-CoReasoner produces clinically sound and culturally grounded reasoning traces.

LG Oct 17, 2024
Virtual Sensing-Enabled Digital Twin Framework for Real-Time Monitoring of Nuclear Systems Leveraging Deep Neural Operators

Raisa Bentay Hossain, Farid Ahmed, Kazuma Kobayashi et al.

Effective real-time monitoring is a foundation of digital twin technology, crucial for detecting material degradation and maintaining the structural integrity of nuclear systems to ensure both safety and operational efficiency. Traditional physical sensor systems face limitations such as installation challenges, high costs, and difficulty measuring critical parameters in hard-to-reach or harsh environments, often resulting in incomplete data coverage. Machine learning-driven virtual sensors, integrated within a digital twin framework, offer a transformative solution by enhancing physical sensor capabilities to monitor critical degradation indicators like pressure, velocity, and turbulence. However, conventional machine learning models struggle with real-time monitoring due to the high-dimensional nature of reactor data and the need for frequent retraining. This paper introduces the use of Deep Operator Networks (DeepONet) as a core component of a digital twin framework to predict key thermal-hydraulic parameters in the hot leg of an AP-1000 Pressurized Water Reactor (PWR). DeepONet serves as a dynamic and scalable virtual sensor by accurately mapping the interplay between operational input parameters and spatially distributed system behaviors. In this study, DeepONet is trained with different operational conditions, which relaxes the requirement of continuous retraining, making it suitable for online and real-time prediction components for digital twin. Our results show that DeepONet achieves accurate predictions with low mean squared error and relative L2 error and can make predictions on unknown data 1400 times faster than traditional CFD simulations. This speed and accuracy enable DeepONet to synchronize with the physical system in real-time, functioning as a dynamic virtual sensor that tracks degradation-contributing conditions.

LG May 24, 2025
From Proxies to Fields: Spatiotemporal Reconstruction of Global Radiation from Sparse Sensor Sequences

Kazuma Kobayashi, Samrendra Roy, Seid Koric et al.

Accurate reconstruction of latent environmental fields from sparse and indirect observations is a foundational challenge across scientific domains, from atmospheric science and geophysics to public health and aerospace safety. Traditional approaches rely on physics-based simulators or dense sensor networks, both constrained by high computational cost, latency, or limited spatial coverage. We present the Temporal Radiation Operator Network (TRON), a spatiotemporal neural operator architecture designed to infer continuous global scalar fields from sequences of sparse, non-uniform proxy measurements. Unlike recent forecasting models that operate on dense, gridded inputs to predict future states, TRON addresses a more ill-posed inverse problem: reconstructing the current global field from sparse, temporally evolving sensor sequences, without access to future observations or dense labels. Demonstrated on global cosmic radiation dose reconstruction, TRON is trained on 22 years of simulation data and generalizes across 65,341 spatial locations, 8,400 days, and sequence lengths from 7 to 90 days. It achieves sub-second inference with relative L2 errors below 0.1%, representing a >58,000X speedup over Monte Carlo-based estimators. Though evaluated in the context of cosmic radiation, TRON offers a domain-agnostic framework for scientific field reconstruction from sparse data, with applications in atmospheric modeling, geophysical hazard monitoring, and real-time environmental risk forecasting.

LG Nov 28, 2024
Virtual Sensing to Enable Real-Time Monitoring of Inaccessible Locations & Unmeasurable Parameters

Kazuma Kobayashi, Farid Ahmed, Syed Bahauddin Alam

Real-time monitoring of critical parameters is essential for the safe and efficient operation of energy systems. However, traditional sensors often fail or degrade in harsh environments, and some locations are inaccessible to physical sensors altogether. In addition, there are important parameters that cannot be directly measured by sensors. Machine learning (ML)-based real-time monitoring is needed in those remote locations to ensure system operations. However, traditional ML models struggle to process continuous sensor profile data, which must be transformed to fit model requirements, leading to the loss of spatial relationships. Another challenge for real-time monitoring is "dataset shift" and the need for frequent retraining under varying conditions, where extensive retraining prohibits real-time inference. To resolve these challenges, this study enables monitoring in locations where physical sensors are impractical to deploy. Our proposed approach, utilizing Multi-Input Operator Network virtual sensors, leverages deep learning to seamlessly integrate diverse data sources and accurately predict key parameters in real-time without the need for additional physical sensors. The approach's effectiveness is demonstrated through thermal-hydraulic monitoring in a nuclear reactor subchannel, achieving remarkable accuracy.

IV Dec 15, 2023
Can Physician Judgment Enhance Model Trustworthiness? A Case Study on Predicting Pathological Lymph Nodes in Rectal Cancer

Kazuma Kobayashi, Yasuyuki Takamizawa, Mototaka Miyake et al.

Explainability is key to enhancing artificial intelligence's trustworthiness in medicine. However, several issues remain concerning the actual benefit of explainable models for clinical decision-making. Firstly, there is a lack of consensus on an evaluation framework for quantitatively assessing the practical benefits that effective explainability should provide to practitioners. Secondly, physician-centered evaluations of explainability are limited. Thirdly, the utility of built-in attention mechanisms in transformer-based models as an explainability technique is unclear. We hypothesize that superior attention maps should align with the information that physicians focus on, potentially reducing prediction uncertainty and increasing model reliability. We employed a multimodal transformer to predict lymph node metastasis in rectal cancer using clinical data and magnetic resonance imaging, exploring how well attention maps, visualized through a state-of-the-art technique, can achieve agreement with physician understanding. We estimated the model's uncertainty using meta-level information like prediction probability variance and quantified agreement. Our assessment of whether this agreement reduces uncertainty found no significant effect. In conclusion, this case study did not confirm the anticipated benefit of attention maps in enhancing model reliability. Superficial explanations could do more harm than good by misleading physicians into relying on uncertain predictions, suggesting that the current state of attention mechanisms in explainability should not be overestimated. Identifying explainability mechanisms truly beneficial for clinical decision-making remains essential.

LGAug 4, 2025
Rep-GLS: Report-Guided Generalized Label Smoothing for Robust Disease Detection

Kunyu Zhang, Fukang Ge, Binyang Wang et al.

Unlike natural image classification, where the ground-truth label is explicit and beyond doubt, physicians commonly interpret medical images with a stated degree of certainty, using phrases such as "probable" or "likely". Existing medical image datasets simply overlook this nuance and polarise findings into binary labels. Here, we propose a novel framework that leverages a Large Language Model (LLM) to directly mine medical reports and use their uncertainty-related expressions as a supervision signal. First, we collect uncertainty keywords from medical reports. Then, we use Qwen-3 4B to identify the textual uncertainty and map it to an adaptive Generalized Label Smoothing (GLS) rate. This rate allows our model to treat uncertain labels not as errors, but as informative signals, effectively incorporating expert skepticism into the training process. We establish a new clinical expert uncertainty-aware benchmark to rigorously evaluate this problem. Experiments demonstrate that our approach significantly outperforms state-of-the-art methods in medical disease detection. The curated uncertainty words database, code, and benchmark will be made publicly available upon acceptance.
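The mapping from report wording to a smoothing rate can be illustrated with a small sketch. It is not the authors' pipeline: a plain keyword lookup stands in for the Qwen-3 4B step, and the table `UNCERTAINTY_RATES`, its values, and the function names are hypothetical. The smoothing formula itself, (1 - r) * y + r / K for K classes, is the standard label smoothing form.

```python
# Hypothetical keyword-to-rate table; the values are illustrative only
# and are not taken from the paper.
UNCERTAINTY_RATES = {"likely": 0.1, "probable": 0.2, "possible": 0.4}


def smoothing_rate(report_phrase):
    """Return the largest rate whose keyword occurs in the phrase (0.0 if none)."""
    text = report_phrase.lower()
    return max((r for k, r in UNCERTAINTY_RATES.items() if k in text), default=0.0)


def smooth_label(one_hot, rate):
    """Label smoothing with a per-sample rate: (1 - rate) * y + rate / K."""
    num_classes = len(one_hot)
    return [(1.0 - rate) * y + rate / num_classes for y in one_hot]


# Keyword lookup replaces the LLM step here (an assumption for illustration).
rate = smoothing_rate("Probable right lower lobe pneumonia")
target = smooth_label([0.0, 1.0], rate)  # classes: [absent, present]
print(rate, target)
```

A report with no hedging keyword keeps its hard label, while a hedged report yields a softer target, so the training loss penalizes disagreement with uncertain labels less strongly.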

LGOct 20, 2025
Cross-Domain Long-Term Forecasting: Radiation Dose from Sparse Neutron Sensor via Spatio-Temporal Operator Network

Jay Phil Yoo, Kazuma Kobayashi, Souvik Chakraborty et al.

Forecasting unobservable physical quantities from sparse, cross-domain sensor data is a central unsolved problem in scientific machine learning. Existing neural operators and large-scale forecasters rely on dense, co-located input-output fields and short temporal contexts, assumptions that fail in real-world systems where sensing and prediction occur on distinct physical manifolds and over long timescales. We introduce the Spatio-Temporal Operator Network (STONe), a non-autoregressive neural operator that learns a stable functional mapping between heterogeneous domains. By directly inferring high-altitude radiation dose fields from sparse ground-based neutron measurements, STONe demonstrates that operator learning can generalize beyond shared-domain settings. It defines a nonlinear operator between sensor and target manifolds that remains stable over long forecasting horizons without iterative recurrence. This challenges the conventional view that operator learning requires domain alignment or autoregressive propagation. Trained on 23 years of global neutron data, STONe achieves accurate 180-day forecasts with millisecond inference latency. The framework establishes a general principle for cross-domain operator inference, enabling real-time prediction of complex spatiotemporal fields in physics, climate, and energy systems.

LGJul 15, 2025
Distribution-Free Uncertainty-Aware Virtual Sensing via Conformalized Neural Operators

Kazuma Kobayashi, Shailesh Garg, Farid Ahmed et al.

Robust uncertainty quantification (UQ) remains a critical barrier to the safe deployment of deep learning in real-time virtual sensing, particularly in high-stakes domains where sparse, noisy, or non-collocated sensor data are the norm. We introduce the Conformalized Monte Carlo Operator (CMCO), a framework that transforms neural operator-based virtual sensing with calibrated, distribution-free prediction intervals. By unifying Monte Carlo dropout with split conformal prediction in a single DeepONet architecture, CMCO achieves spatially resolved uncertainty estimates without retraining, ensembling, or custom loss design. Our method addresses a longstanding challenge: how to endow operator learning with efficient and reliable UQ across heterogeneous domains. Through rigorous evaluation on three distinct applications (turbulent flow, elastoplastic deformation, and global cosmic radiation dose estimation), CMCO consistently attains near-nominal empirical coverage, even in settings with strong spatial gradients and proxy-based sensing. This breakthrough offers a general-purpose, plug-and-play UQ solution for neural operators, unlocking real-time, trustworthy inference in digital twins, sensor fusion, and safety-critical monitoring. By bridging theory and deployment with minimal computational overhead, CMCO establishes a new foundation for scalable, generalizable, and uncertainty-aware scientific machine learning.
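The combination of Monte Carlo dropout with split conformal prediction can be sketched as follows. The abstract does not state which nonconformity score CMCO uses; this sketch assumes the absolute residual normalized by the MC-dropout standard deviation, which is one common choice. The MC-dropout means and standard deviations are taken as given inputs, and all function names are hypothetical.

```python
import math


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))  # 1-indexed order statistic
    if rank > n:
        return float("inf")  # too few calibration points for this alpha
    return sorted(scores)[rank - 1]


def calibrate(y_true, mc_mean, mc_std, alpha, eps=1e-8):
    """Score = |y - mean| / std on a held-out calibration split (assumed score)."""
    scores = [abs(y - m) / (s + eps) for y, m, s in zip(y_true, mc_mean, mc_std)]
    return conformal_quantile(scores, alpha)


def predict_interval(mean, std, q, eps=1e-8):
    """Scale the MC-dropout std by the calibrated factor q."""
    half_width = q * (std + eps)
    return mean - half_width, mean + half_width


# Toy calibration split: 9 points, predicted mean 0, std 0.5 (illustrative values).
cal_truth = [0.1 * i for i in range(1, 10)]
cal_mean = [0.0] * 9
cal_std = [0.5] * 9
q = calibrate(cal_truth, cal_mean, cal_std, alpha=0.25)

# Interval for a new query point with MC-dropout mean 10.0 and std 0.25.
lower, upper = predict_interval(10.0, 0.25, q)
print(q, lower, upper)
```

Because the calibration step only needs forward passes on held-out data, it adds no retraining, which is consistent with the "without retraining, ensembling, or custom loss design" claim in the abstract.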

LGJul 4, 2025
When Network Architecture Meets Physics: Deep Operator Learning for Coupled Multiphysics

Kazuma Kobayashi, Jaewan Park, Qibang Liu et al.

Scientific applications increasingly demand real-time surrogate models that can capture the behavior of strongly coupled multiphysics systems driven by multiple input functions, such as in thermo-mechanical and electro-thermal processes. While neural operator frameworks, such as Deep Operator Networks (DeepONets), have shown considerable success in single-physics settings, their extension to multiphysics problems remains poorly understood. In particular, the challenge of learning nonlinear interactions between tightly coupled physical fields has received little systematic attention. This study addresses a foundational question: should the architectural design of a neural operator reflect the strength of physical coupling it aims to model? To answer this, we present the first comprehensive, architecture-aware evaluation of DeepONet variants across three regimes: single-physics, weakly coupled, and strongly coupled multiphysics systems. We consider a reaction-diffusion equation with dual spatial inputs, a nonlinear thermo-electrical problem with bidirectional coupling through temperature-dependent conductivity, and a viscoplastic thermo-mechanical model of steel solidification governed by transient phase-driven interactions. Two operator-learning frameworks, the classical DeepONet and its sequential GRU-based extension, S-DeepONet, are benchmarked using both single-branch and multi-branch (MIONet-style) architectures. Our results demonstrate that architectural alignment with physical coupling is crucial: single-branch networks significantly outperform multi-branch counterparts in strongly coupled settings, whereas multi-branch encodings offer advantages for decoupled or single-physics problems. Once trained, these surrogates achieve full-field predictions up to 1.8e4 times faster than high-fidelity finite-element solvers, without compromising solution accuracy.
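The difference between the single-branch and multi-branch (MIONet-style) encodings compared above can be sketched in a few lines. This is a structural illustration only: plain linear maps stand in for trained branch and trunk networks, and the weights, inputs, and function names are hypothetical. A single branch sees both input functions concatenated, whereas the multi-branch variant encodes each input separately and merges the encodings by elementwise product before the dot product with the trunk output.

```python
def linear(weights, x):
    """Apply a weight matrix (list of rows) to x; a stand-in for a trained network."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def single_branch_output(w_branch, w_trunk, u1, u2, y):
    """Classical DeepONet: one branch encodes the concatenated input functions."""
    b = linear(w_branch, u1 + u2)
    t = linear(w_trunk, y)
    return sum(bk * tk for bk, tk in zip(b, t))


def multi_branch_output(w_b1, w_b2, w_trunk, u1, u2, y):
    """MIONet-style: one branch per input, merged by elementwise product."""
    b1 = linear(w_b1, u1)
    b2 = linear(w_b2, u2)
    t = linear(w_trunk, y)
    return sum(p * q * tk for p, q, tk in zip(b1, b2, t))


# Two input functions sampled at two sensors each, one query coordinate.
u1, u2, y = [1.0, 2.0], [3.0, 4.0], [0.5]
w_trunk = [[2.0], [4.0]]                      # latent dimension p = 2
w_branch = [[1.0, 0.0, 1.0, 0.0],
            [0.0, 1.0, 0.0, 1.0]]
w_b1 = [[1.0, 0.0], [0.0, 1.0]]
w_b2 = [[1.0, 0.0], [0.0, 1.0]]

single = single_branch_output(w_branch, w_trunk, u1, u2, y)
multi = multi_branch_output(w_b1, w_b2, w_trunk, u1, u2, y)
print(single, multi)
```

In the single-branch form the two inputs can interact inside the branch network from its first layer, while in the multi-branch form they interact only through the final product, which is one way to read the abstract's finding about strongly coupled systems.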

CVMar 23, 2021
Decomposing Normal and Abnormal Features of Medical Images into Discrete Latent Codes for Content-Based Image Retrieval

Kazuma Kobayashi, Ryuichiro Hataya, Yusuke Kurose et al.

In medical imaging, the characteristics purely derived from a disease should reflect the extent to which abnormal findings deviate from the normal features. Indeed, physicians often need corresponding images without abnormal findings of interest or, conversely, images that contain similar abnormal findings regardless of normal anatomical context. This is called comparative diagnostic reading of medical images, which is essential for a correct diagnosis. To support comparative diagnostic reading, content-based image retrieval (CBIR), which can selectively utilize normal and abnormal features in medical images as two separable semantic components, will be useful. Therefore, we propose a neural network architecture to decompose the semantic components of medical images into two latent codes: normal anatomy code and abnormal anatomy code. The normal anatomy code represents normal anatomies that should have existed if the sample were healthy, whereas the abnormal anatomy code accounts for abnormal changes that reflect deviation from the normal baseline. These latent codes are discretized through vector quantization to enable binary hashing, which can reduce the computational burden at the time of similarity search. By calculating the similarity based on either normal or abnormal anatomy codes or the combination of the two codes, our algorithm can retrieve images according to the selected semantic component from a dataset consisting of brain magnetic resonance images of gliomas. Our CBIR system qualitatively and quantitatively achieves remarkable results.
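As a rough illustration of why discretized codes make similarity search cheap, the sketch below quantizes latent vectors to their nearest codebook entry, packs the resulting indices into an integer hash, and ranks images by Hamming distance on either the normal or the abnormal code. The abstract does not describe the exact hashing scheme, so the codebook, the bit packing, and every function name here are assumptions.

```python
def quantize(vector, codebook):
    """Return the index of the nearest codebook entry (squared Euclidean distance)."""
    dists = [sum((v - c) ** 2 for v, c in zip(vector, entry)) for entry in codebook]
    return dists.index(min(dists))


def to_hash(indices, bits_per_index):
    """Pack codebook indices into a single integer, bits_per_index bits each."""
    h = 0
    for idx in indices:
        h = (h << bits_per_index) | idx
    return h


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def retrieve(query, database, component):
    """Rank items by Hamming distance on the chosen code: 'normal' or 'abnormal'."""
    return sorted(database, key=lambda item: hamming(query[component], item[component]))


# Hypothetical 4-entry codebook, so each index fits in 2 bits.
codebook = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
hash_a = to_hash([quantize(v, codebook) for v in [[0.1, 0.9], [0.8, 0.2]]], 2)
hash_b = to_hash([quantize(v, codebook) for v in [[0.9, 0.9], [0.8, 0.1]]], 2)

database = [
    {"id": "img1", "normal": 0b0110, "abnormal": 0b0000},
    {"id": "img2", "normal": 0b1001, "abnormal": 0b1110},
]
query = {"normal": hash_a, "abnormal": hash_b}
by_normal = [item["id"] for item in retrieve(query, database, "normal")]
by_abnormal = [item["id"] for item in retrieve(query, database, "abnormal")]
print(hash_a, hash_b, by_normal, by_abnormal)
```

The same query returns a different ranking depending on which code is compared, which mirrors the selective retrieval by semantic component described in the abstract.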

IVNov 12, 2020
Decomposing Normal and Abnormal Features of Medical Images for Content-based Image Retrieval

Kazuma Kobayashi, Ryuichiro Hataya, Yusuke Kurose et al.

Medical images can be decomposed into normal and abnormal features, a property known as compositionality. Based on this idea, we propose an encoder-decoder network to decompose a medical image into two discrete latent codes: a normal anatomy code and an abnormal anatomy code. Using these latent codes, we demonstrate similarity retrieval that focuses on either the normal or the abnormal features of medical images.

IVMay 26, 2020
Learning Global and Local Features of Normal Brain Anatomy for Unsupervised Abnormality Detection

Kazuma Kobayashi, Ryuichiro Hataya, Yusuke Kurose et al.

In real-world clinical practice, overlooking unanticipated findings can result in serious consequences. However, supervised learning, which is the foundation for the current success of deep learning, only encourages models to identify abnormalities that are defined in datasets in advance. Therefore, abnormality detection must be implemented in medical images that are not limited to a specific disease category. In this study, we demonstrate an unsupervised learning framework for pixel-wise abnormality detection in brain magnetic resonance imaging captured from a patient population with metastatic brain tumor. Our concept is as follows: If an image reconstruction network can faithfully reproduce the global features of normal anatomy, then the abnormal lesions in unseen images can be identified based on the local difference from those reconstructed as normal by a discriminative network. Both networks are trained on a dataset comprising only normal images without labels. In addition, we devise a metric to evaluate the anatomical fidelity of the reconstructed images and confirm that the overall detection performance is improved when the image reconstruction network achieves a higher score. For evaluation, clinically significant abnormalities are comprehensively segmented. The results show that the area under the receiver operating characteristic curve values for metastatic brain tumors, extracranial metastatic tumors, postoperative cavities, and structural changes are 0.78, 0.61, 0.91, and 0.60, respectively.
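The evaluation idea, scoring each pixel by how far it departs from a "reconstructed as normal" image and summarizing detection with the area under the ROC curve, can be sketched as follows. The paper measures local differences with a discriminative network; this sketch swaps in a plain absolute pixel difference, and the pixel values, labels, and function names are illustrative assumptions.

```python
def anomaly_scores(image, reconstruction):
    """Per-pixel absolute difference from the reconstructed-as-normal image."""
    return [abs(x - r) for x, r in zip(image, reconstruction)]


def auroc(scores, labels):
    """AUROC as the probability that an abnormal pixel outscores a normal one."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one abnormal and one normal pixel")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Four flattened pixels; the last two are labeled abnormal (toy values).
image = [0.2, 0.5, 0.9, 0.8]
reconstruction = [0.2, 0.3, 0.3, 0.7]
labels = [0, 0, 1, 1]

scores = anomaly_scores(image, reconstruction)
area = auroc(scores, labels)
print(area)
```

The pairwise form is quadratic in the number of pixels and is meant only to make the definition explicit; a rank-based implementation would be preferable for full-resolution images.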