COMP-PHApr 25, 2023
Unsupervised Discovery of Extreme Weather Events Using Universal Representations of Emergent OrganizationAdam Rupe, Karthik Kashinath, Nalini Kumar et al.
Spontaneous self-organization is ubiquitous in systems far from thermodynamic equilibrium. While organized structures that emerge dominate transport properties, universal representations that identify and describe these key objects remain elusive. Here, we introduce a theoretically-grounded framework for describing emergent organization that, via data-driven algorithms, is constructive in practice. Its building blocks are spacetime lightcones that embody how information propagates across a system through local interactions. We show that predictive equivalence classes of lightcones -- local causal states -- capture organized behaviors and coherent structures in complex spatiotemporal systems. Employing an unsupervised physics-informed machine learning algorithm and a high-performance computing implementation, we demonstrate automatically discovering coherent structures in two real world domain science problems. We show that local causal states identify vortices and track their power-law decay behavior in two-dimensional fluid turbulence. We then show how to detect and track familiar extreme weather events -- hurricanes and atmospheric rivers -- and discover other novel coherent structures associated with precipitation extremes in high-resolution climate data at the grid-cell level.
COMP-PHSep 25, 2019Code
DisCo: Physics-Based Unsupervised Discovery of Coherent Structures in Spatiotemporal SystemsAdam Rupe, Nalini Kumar, Vladislav Epifanov et al.
Extracting actionable insight from complex unlabeled scientific data is an open challenge and key to unlocking data-driven discovery in science. Complementary and alternative to supervised machine learning approaches, unsupervised physics-based methods based on behavior-driven theories hold great promise. Due to computational limitations, practical application on real-world domain science problems has lagged far behind theoretical development. We present our first step towards bridging this divide - DisCo - a high-performance distributed workflow for the behavior-driven local causal state theory. DisCo provides a scalable unsupervised physics-based representation learning method that decomposes spatiotemporal systems into their structurally relevant components, which are captured by the latent local causal state variables. Complex spatiotemporal systems are generally highly structured and organize around a lower-dimensional skeleton of coherent structures, and in several firsts we demonstrate the efficacy of DisCo in capturing such structures from observational and simulated scientific data. To the best of our knowledge, DisCo is also the first application software developed entirely in Python to scale to over 1000 machine nodes, providing good performance along with ensuring domain scientists' productivity. We developed scalable, performant methods optimized for Intel many-core processors that will be upstreamed to open-source Python library packages. Our capstone experiment, using newly developed DisCo workflow and libraries, performs unsupervised spacetime segmentation analysis of CAM5.1 climate simulation data, processing an unprecedented 89.5 TB in 6.6 minutes end-to-end using 1024 Intel Haswell nodes on the Cori supercomputer obtaining 91% weak-scaling and 64% strong-scaling efficiency.
COMP-PHSep 16, 2019
Towards Unsupervised Segmentation of Extreme Weather EventsAdam Rupe, Karthik Kashinath, Nalini Kumar et al.
Extreme weather is one of the main mechanisms through which climate change will directly impact human society. Coping with such change as a global community requires markedly improved understanding of how global warming drives extreme weather events. While alternative climate scenarios can be simulated using sophisticated models, identifying extreme weather events in these simulations requires automation due to the vast amounts of complex high-dimensional data produced. Atmospheric dynamics, and hydrodynamic flows more generally, are highly structured and largely organize around a lower dimensional skeleton of coherent structures. Indeed, extreme weather events are a special case of more general hydrodynamic coherent structures. We present a scalable physics-based representation learning method that decomposes spatiotemporal systems into their structurally relevant components, which are captured by latent variables known as local causal states. For complex fluid flows we show our method is capable of capturing known coherent structures, and with promising segmentation results on CAM5.1 water vapor data we outline the path to extreme weather identification from unlabeled climate model simulation data.
COAug 14, 2018
CosmoFlow: Using Deep Learning to Learn the Universe at ScaleAmrita Mathuriya, Deborah Bard, Peter Mendygral et al.
Deep learning is a promising tool to determine the physical model that describes our universe. To handle the considerable computational cost of this problem, we present CosmoFlow: a highly scalable deep learning application built on top of the TensorFlow framework. CosmoFlow uses efficient implementations of 3D convolution and pooling primitives, together with improvements in threading for many element-wise operations, to improve training performance on Intel(C) Xeon Phi(TM) processors. We also utilize the Cray PE Machine Learning Plugin for efficient scaling to multiple nodes. We demonstrate fully synchronous data-parallel training on 8192 nodes of Cori with 77% parallel efficiency, achieving 3.5 Pflop/s sustained performance. To our knowledge, this is the first large-scale science application of the TensorFlow framework at supercomputer scale with fully-synchronous training. These enhancements enable us to process large 3D dark matter distribution and predict the cosmological parameters $Ω_M$, $σ_8$ and n$_s$ with unprecedented accuracy.