Detecting Distributional Differences in Labeled Sequence Data with Application to Tropical Cyclone Satellite Imagery
This work addresses the challenge of forecasting rapid intensification in tropical cyclones, which is critical for disaster preparedness. The contribution is incremental in that it builds on existing statistical and neural network methods.
The authors tackle the problem of detecting distributional differences in tropical cyclone satellite imagery ahead of rapid intensity change events. They propose a nonparametric test and use it to identify archetypes of infrared imagery associated with elevated risk, typically marked by deep or deepening core convection over time.
Our goal is to quantify whether, and if so how, spatio-temporal patterns in tropical cyclone (TC) satellite imagery signal an upcoming rapid intensity change event. To address this question, we propose a new nonparametric test of association between a time series of images and a series of binary event labels. We ask whether there is a difference in distribution between (dependent but identically distributed) 24-h sequences of images preceding an event versus a non-event. By rewriting the statistical test as a regression problem, we leverage neural networks to infer modes of structural evolution of TC convection that are representative of the lead-up to rapid intensity change events. Dependencies between nearby sequences are handled by a bootstrap procedure that estimates the marginal distribution of the label series. We prove that type I error control is guaranteed as long as the distribution of the label series is well-estimated, which is made easier by the extensive historical data for binary TC event labels. We show empirical evidence that our proposed method identifies archetypes of infrared imagery associated with elevated rapid intensification risk, typically marked by deep or deepening core convection over time. Such results provide a foundation for improved forecasts of rapid intensification.
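The structure of the test described above can be illustrated with a minimal sketch. It rests on the observation that the sequences preceding events and non-events share a distribution exactly when the conditional event probability given the sequence equals the marginal event probability, so a regression estimate that departs from the base rate is evidence against the null. Everything specific in the code is an assumption made for illustration and is not taken from the paper: a ridge-penalized linear probability model stands in for the neural network, a two-state Markov chain stands in for the estimated distribution of the label series, low-dimensional feature vectors stand in for image sequences, and the function names are hypothetical.

```python
import numpy as np

def fit_markov_chain(y):
    """Estimate P(Y_t = 1 | Y_{t-1} = k), k in {0, 1}, from a binary series."""
    y = np.asarray(y, dtype=int)
    prev, nxt = y[:-1], y[1:]
    # Add-one smoothing keeps the estimates strictly between 0 and 1.
    p1_given = np.array([
        (nxt[prev == k].sum() + 1.0) / ((prev == k).sum() + 2.0)
        for k in (0, 1)
    ])
    return p1_given, float(y.mean())

def simulate_labels(p1_given, p_init, n, rng):
    """Draw a binary label series of length n from the fitted chain."""
    y = np.empty(n, dtype=int)
    y[0] = rng.random() < p_init
    for t in range(1, n):
        y[t] = rng.random() < p1_given[y[t - 1]]
    return y

def regression_statistic(x_train, y_train, x_eval, ridge=1e-3):
    """Fit a ridge linear probability model on the training part and return
    the mean squared deviation of held-out predictions from the base rate."""
    a = np.column_stack([np.ones(len(x_train)), x_train])
    coef = np.linalg.solve(a.T @ a + ridge * np.eye(a.shape[1]), a.T @ y_train)
    pred = np.column_stack([np.ones(len(x_eval)), x_eval]) @ coef
    return float(np.mean((pred - y_train.mean()) ** 2))

def label_bootstrap_test(x, y, n_boot=200, seed=0):
    """Monte Carlo test: compare the observed statistic with statistics
    computed on label series simulated independently of the features."""
    rng = np.random.default_rng(seed)
    n, half = len(y), len(y) // 2
    observed = regression_statistic(x[:half], y[:half], x[half:])
    p1_given, p_init = fit_markov_chain(y)
    null_stats = np.empty(n_boot)
    for b in range(n_boot):
        # Simulated labels keep temporal dependence but ignore the features.
        y_star = simulate_labels(p1_given, p_init, n, rng)
        null_stats[b] = regression_statistic(x[:half], y_star[:half], x[half:])
    p_value = (1 + np.sum(null_stats >= observed)) / (1 + n_boot)
    return observed, float(p_value)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic stand-in data: features shift when the label is 1.
    y = simulate_labels(np.array([0.1, 0.8]), 0.3, 400, rng)
    x = 2.0 * y[:, None] + rng.normal(size=(400, 3))
    stat, p = label_bootstrap_test(x, y)
    print(f"statistic={stat:.4f}, p-value={p:.4f}")
```

In this sketch the features are held fixed and only the labels are resampled, which mirrors the idea that validity depends on how well the label-series distribution is estimated rather than on modeling the high-dimensional image sequences.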