SPFeb 23Code
Fast Spectrogram Event Extraction via Offline Self-Supervised Learning: From Fusion Diagnostics to BioacousticsNathaniel Chen, Kouroche Bouchiat, Peter Steiner et al.
Next-generation fusion facilities like ITER face a "data deluge," generating petabytes of multi-diagnostic signals daily that challenge manual analysis. We present a "signals-first" self-supervised framework for the automated extraction of coherent and transient modes from high-noise time-frequency data. We also develop a general-purpose method and tool for extracting coherent, quasi-coherent, and transient modes for fluctuation measurements in tokamaks by employing non-linear optimal techniques in multichannel signal processing with a fast neural network surrogate on fast magnetics, electron cyclotron emission, CO2 interferometers, and beam emission spectroscopy measurements from DIII-D. Results are tested on data from DIII-D, TJ-II, and non-fusion spectrograms. With an inference latency of 0.5 seconds, this framework enables real-time mode identification and large-scale automated database generation for advanced plasma control. Repository is in https://github.com/PlasmaControl/TokEye.
LGJan 30
Beyond the Loss Curve: Scaling Laws, Active Learning, and the Limits of Learning from Exact PosteriorsArian Khorasani, Nathaniel Chen, Yug D Oswal et al.
How close are neural networks to the best they could possibly do? Standard benchmarks cannot answer this because they lack access to the true posterior p(y|x). We use class-conditional normalizing flows as oracles that make exact posteriors tractable on realistic images (AFHQ, ImageNet). This enables five lines of investigation. Scaling laws: Prediction error decomposes into irreducible aleatoric uncertainty and reducible epistemic error; the epistemic component follows a power law in dataset size, continuing to shrink even when total loss plateaus. Limits of learning: The aleatoric floor is exactly measurable, and architectures differ markedly in how they approach it: ResNets exhibit clean power-law scaling while Vision Transformers stall in low-data regimes. Soft labels: Oracle posteriors contain learnable structure beyond class labels: training with exact posteriors outperforms hard labels and yields near-perfect calibration. Distribution shift: The oracle computes exact KL divergence of controlled perturbations, revealing that shift type matters more than shift magnitude: class imbalance barely affects accuracy at divergence values where input noise causes catastrophic degradation. Active learning: Exact epistemic uncertainty distinguishes genuinely informative samples from inherently ambiguous ones, improving sample efficiency. Our framework reveals that standard metrics hide ongoing learning, mask architectural differences, and cannot diagnose the nature of distribution shift.
COMP-PHSep 10, 2025Code
PCGBandit: One-shot acceleration of transient PDE solvers via online-learned preconditionersMikhail Khodak, Min Ki Jung, Brian Wynne et al.
Data-driven acceleration of scientific computing workflows has been a high-profile aim of machine learning (ML) for science, with numerical simulation of transient partial differential equations (PDEs) being one of the main applications. The focus thus far has been on methods that require classical simulations to train, which when combined with the data-hungriness and optimization challenges of neural networks has caused difficulties in demonstrating a convincing advantage against strong classical baselines. We consider an alternative paradigm in which the learner uses a classical solver's own data to accelerate it, enabling a one-shot speedup of the simulation. Concretely, since transient PDEs often require solving a sequence of related linear systems, the feedback from repeated calls to a linear solver such as preconditioned conjugate gradient (PCG) can be used by a bandit algorithm to online-learn an adaptive sequence of solver configurations (e.g. preconditioners). The method we develop, PCGBandit, is implemented directly on top of the popular open source software OpenFOAM, which we use to show its effectiveness on a set of fluid and magnetohydrodynamics (MHD) problems.
50.4LGMay 7
Offline Reinforcement Learning for Rotation Profile Control in TokamaksRohit Sonker, Hiro Josep Farre Kaga, Jiayu Chen et al.
Tokamaks remain leading candidates for achieving practical fusion energy, yet many important control problems inside these devices are still difficult or unsolved. One such challenge is controlling the plasma rotation profile, which strongly influences stability, confinement, and transport. While the average rotation can be controlled, controlling the full profile is challenging due to high dimensionality, response to multiple actuators and dependence on plasma condition. Learning-based control methods, such as reinforcement learning (RL), provide a potential solution to this challenging problem with ability to model complex interactions leading to effective multi-input multi-output control. However, learning such policies is challenging due to the lack of accurate simulators that can model the rotation profile dynamics. In this work, we investigate the use of offline RL and offline model-based RL algorithms for rotation profile control, training them solely on historical data from the DIII-D tokamak. Our final method uses probabilistic models of plasma dynamics to generate rollouts for RL training. We deploy this policy on the DIII-D Tokamak and observe promising real-world results. We conclude by highlighting key challenges and insights from training and deploying an RL policy on a complex physical device while using only limited past data.
PLASM-PHApr 18, 2024
Full Shot Predictions for the DIII-D Tokamak via Deep Recurrent NetworksIan Char, Youngseog Chung, Joseph Abbate et al.
Although tokamaks are one of the most promising devices for realizing nuclear fusion as an energy source, there are still key obstacles when it comes to understanding the dynamics of the plasma and controlling it. As such, it is crucial that high quality models are developed to assist in overcoming these obstacles. In this work, we take an entirely data driven approach to learn such a model. In particular, we use historical data from the DIII-D tokamak to train a deep recurrent network that is able to predict the full time evolution of plasma discharges (or "shots"). Following this, we investigate how different training and inference procedures affect the quality and calibration of the shot predictions.
LGJun 21, 2025
Regulation Compliant AI for Fusion: Real-Time Image Analysis-Based Control of Divertor Detachment in TokamaksNathaniel Chen, Cheolsik Byun, Azarakash Jalalvand et al.
While artificial intelligence (AI) has been promising for fusion control, its inherent black-box nature will make compliant implementation in regulatory environments a challenge. This study implements and validates a real-time AI enabled linear and interpretable control system for successful divertor detachment control with the DIII-D lower divertor camera. Using D2 gas, we demonstrate feedback divertor detachment control with a mean absolute difference of 2% from the target for both detachment and reattachment. This automatic training and linear processing framework can be extended to any image based diagnostic for regulatory compliant controller necessary for future fusion reactors.
PLASM-PHMay 9, 2024
Multimodal Super-Resolution: Discovering hidden physics and its application to fusion plasmasAzarakhsh Jalalvand, SangKyeun Kim, Jaemin Seo et al.
A non-linear system governed by multi-spatial and multi-temporal physics scales cannot be fully understood with a single diagnostic, as each provides only a partial view, leading to information loss. Combining multiple diagnostics may also result in incomplete projections of the system's physics. By identifying hidden inter-correlations between diagnostics, we can leverage mutual support to fill in these gaps, but uncovering such correlations analytically is too complex. We introduce a machine learning methodology to address this issue. Unlike traditional methods, our multimodal approach does not rely on the target diagnostic's direct measurements to generate its super-resolution version. Instead, it uses other diagnostics to produce super-resolution data, capturing detailed structural evolution and responses to perturbations previously unobservable. This not only enhances the resolution of a diagnostic for deeper insights but also reconstructs the target diagnostic, providing a valuable tool to mitigate diagnostic failure. This methodology addresses a key challenge in fusion plasmas: the Edge Localized Mode (ELM), a plasma instability that can cause significant erosion of plasma-facing materials. A method to stabilize ELM is using resonant magnetic perturbation (RMP) to trigger magnetic islands. However, limited spatial and temporal resolution restricts analysis of these islands due to their small size, rapid dynamics, and complex plasma interactions. With super-resolution diagnostics, we can experimentally verify theoretical models of magnetic islands for the first time, providing insights into their role in ELM stabilization. This advancement supports the development of effective ELM suppression strategies for future fusion reactors like ITER and has broader applications, potentially revolutionizing diagnostics in fields such as astronomy, astrophysics, and medical imaging.
LGJun 23, 2020
Neural Dynamical Systems: Balancing Structure and Flexibility in Physical PredictionViraj Mehta, Ian Char, Willie Neiswanger et al.
We introduce Neural Dynamical Systems (NDS), a method of learning dynamical models in various gray-box settings which incorporates prior knowledge in the form of systems of ordinary differential equations. NDS uses neural networks to estimate free parameters of the system, predicts residual terms, and numerically integrates over time to predict future states. A key insight is that many real dynamical systems of interest are hard to model because the dynamics may vary across rollouts. We mitigate this problem by taking a trajectory of prior states as the input to NDS and train it to dynamically estimate system parameters using the preceding trajectory. We find that NDS learns dynamics with higher accuracy and fewer samples than a variety of deep learning methods that do not incorporate the prior knowledge and methods from the system identification literature which do. We demonstrate these advantages first on synthetic dynamical systems and then on real data captured from deuterium shots from a nuclear fusion reactor. Finally, we demonstrate that these benefits can be utilized for control in small-scale experiments.
LGJan 6, 2020
Offline Contextual Bayesian Optimization for Nuclear FusionYoungseog Chung, Ian Char, Willie Neiswanger et al.
Nuclear fusion is regarded as the energy of the future since it presents the possibility of unlimited clean energy. One obstacle in utilizing fusion as a feasible energy source is the stability of the reaction. Ideally, one would have a controller for the reactor that makes actions in response to the current state of the plasma in order to prolong the reaction as long as possible. In this work, we make preliminary steps to learning such a controller. Since learning on a real world reactor is infeasible, we tackle this problem by attempting to learn optimal controls offline via a simulator, where the state of the plasma can be explicitly set. In particular, we introduce a theoretically grounded Bayesian optimization algorithm that recommends a state and action pair to evaluate at every iteration and show that this results in more efficient use of the simulator.