LGSep 21, 2022
Adaptive Bias Correction for Improved Subseasonal ForecastingSoukayna Mouatadid, Paulo Orenstein, Genevieve Flaspohler et al.
Subseasonal forecasting -- predicting temperature and precipitation 2 to 6 weeks ahead -- is critical for effective water allocation, wildfire management, and drought and flood mitigation. Recent international research efforts have advanced the subseasonal capabilities of operational dynamical models, yet temperature and precipitation prediction skills remain poor, partly due to stubborn errors in representing atmospheric dynamics and physics inside dynamical models. Here, to counter these errors, we introduce an adaptive bias correction (ABC) method that combines state-of-the-art dynamical forecasts with observations using machine learning. We show that, when applied to the leading subseasonal model from the European Centre for Medium-Range Weather Forecasts (ECMWF), ABC improves temperature forecasting skill by 60-90% (over baseline skills of 0.18-0.25) and precipitation forecasting skill by 40-69% (over baseline skills of 0.11-0.15) in the contiguous U.S. We couple these performance improvements with a practical workflow to explain ABC skill gains and identify higher-skill windows of opportunity based on specific climate conditions.
28.8LGApr 17
Enhancing AI and Dynamical Subseasonal Forecasts with Probabilistic Bias CorrectionHannah Guan, Soukayna Mouatadid, Paulo Orenstein et al.
Decision-makers rely on weather forecasts to plant crops, manage wildfires, allocate water and energy, and prepare for weather extremes. Today, such forecasts enjoy unprecedented accuracy out to two weeks thanks to steady advances in physics-based dynamical models and data-driven artificial intelligence (AI) models. However, model skill drops precipitously at subseasonal timescales (2 - 6 weeks ahead), due to compounding errors and persistent biases. To counter this degradation, we introduce probabilistic bias correction (PBC), a machine learning framework that substantially reduces systematic error by learning to correct historical probabilistic forecasts. When applied to the leading dynamical and AI models from the European Centre for Medium-Range Weather Forecasts (ECMWF), PBC doubles the subseasonal skill of the AI Forecasting System and improves the skill of the operationally-debiased dynamical model for 91% of pressure, 92% of temperature, and 98% of precipitation targets. We designed PBC for operational deployment, and, in ECMWF's 2025 real-time forecasting competition, its global forecasts placed first for all weather variables and lead times, outperforming the dynamical models from six operational forecasting centers, an international dynamical multi-model ensemble, ECMWF's AI Forecasting System, and the forecasting systems of 34 teams worldwide. These probabilistic skill gains translate into more accurate prediction of extreme events and have the potential to improve agricultural planning, energy management, and disaster preparedness in vulnerable communities.
AO-PHSep 21, 2021Code
SubseasonalClimateUSA: A Dataset for Subseasonal Forecasting and BenchmarkingSoukayna Mouatadid, Paulo Orenstein, Genevieve Flaspohler et al.
Subseasonal forecasting of the weather two to six weeks in advance is critical for resource allocation and advance disaster notice but poses many challenges for the forecasting community. At this forecast horizon, physics-based dynamical models have limited skill, and the targets for prediction depend in a complex manner on both local weather variables and global climate variables. Recently, machine learning methods have shown promise in advancing the state of the art but only at the cost of complex data curation, integrating expert knowledge with aggregation across multiple relevant data sources, file formats, and temporal and spatial resolutions. To streamline this process and accelerate future development, we introduce SubseasonalClimateUSA, a curated dataset for training and benchmarking subseasonal forecasting models in the United States. We use this dataset to benchmark a diverse suite of models, including operational dynamical models, classical meteorological baselines, and ten state-of-the-art machine learning and deep learning-based methods from the literature. Overall, our benchmarks suggest simple and effective ways to extend the accuracy of current operational models. SubseasonalClimateUSA is regularly updated and accessible via the https://github.com/microsoft/subseasonal_data/ Python package.
LGJun 13, 2021
Online Learning with Optimism and DelayGenevieve Flaspohler, Francesco Orabona, Judah Cohen et al.
Inspired by the demands of real-time climate and weather forecasting, we develop optimistic online learning algorithms that require no parameter tuning and have optimal regret guarantees under delayed feedback. Our algorithms -- DORM, DORM+, and AdaHedgeD -- arise from a novel reduction of delayed online learning to optimistic online learning that reveals how optimistic hints can mitigate the regret penalty caused by delay. We pair this delay-as-optimism perspective with a new analysis of optimistic learning that exposes its robustness to hinting errors and a new meta-algorithm for learning effective hinting strategies in the presence of delay. We conclude by benchmarking our algorithms on four subseasonal climate forecasting tasks, demonstrating low regret relative to state-of-the-art forecasting models.
ROSep 26, 2019
Information-Guided Robotic Maximum Seek-and-Sample in Partially Observable Continuous EnvironmentsGenevieve Flaspohler, Victoria Preston, Anna P. M. Michel et al.
We present PLUMES, a planner to localizing and collecting samples at the global maximum of an a priori unknown and partially observable continuous environment. The "maximum-seek-and-sample" (MSS) problem is pervasive in the environmental and earth sciences. Experts want to collect scientifically valuable samples at an environmental maximum (e.g., an oil-spill source), but do not have prior knowledge about the phenomenon's distribution. We formulate the MSS problem as a partially-observable Markov decision process (POMDP) with continuous state and observation spaces, and a sparse reward signal. To solve the MSS POMDP, PLUMES uses an information-theoretic reward heuristic with continous-observation Monte Carlo Tree Search to efficiently localize and sample from the global maximum. In simulation and field experiments, PLUMES collects more scientifically valuable samples than state-of-the-art planners in a diverse set of environments, with various platforms, sensors, and challenging real-world conditions.
ROMar 7, 2019
Streaming Scene Maps for Co-Robotic Exploration in Bandwidth Limited EnvironmentsYogesh Girdhar, Levi Cai, Stewart Jamieson et al.
This paper proposes a bandwidth tunable technique for real-time probabilistic scene modeling and mapping to enable co-robotic exploration in communication constrained environments such as the deep sea. The parameters of the system enable the user to characterize the scene complexity represented by the map, which in turn determines the bandwidth requirements. The approach is demonstrated using an underwater robot that learns an unsupervised scene model of the environment and then uses this scene model to communicate the spatial distribution of various high-level semantic scene constructs to a human operator. Preliminary experiments in an artificially constructed tank environment as well as simulated missions over a 10m$\times$10m coral reef using real data show the tunability of the maps to different bandwidth constraints and science interests. To our knowledge this is the first paper to quantify how the free parameters of the unsupervised scene model impact both the scientific utility of and bandwidth required to communicate the resulting scene model.
LGNov 30, 2017
Feature discovery and visualization of robot mission data using convolutional autoencoders and Bayesian nonparametric topic modelsGenevieve Flaspohler, Nicholas Roy, Yogesh Girdhar
The gap between our ability to collect interesting data and our ability to analyze these data is growing at an unprecedented rate. Recent algorithmic attempts to fill this gap have employed unsupervised tools to discover structure in data. Some of the most successful approaches have used probabilistic models to uncover latent thematic structure in discrete data. Despite the success of these models on textual data, they have not generalized as well to image data, in part because of the spatial and temporal structure that may exist in an image stream. We introduce a novel unsupervised machine learning framework that incorporates the ability of convolutional autoencoders to discover features from images that directly encode spatial information, within a Bayesian nonparametric topic model that discovers meaningful latent patterns within discrete data. By using this hybrid framework, we overcome the fundamental dependency of traditional topic models on rigidly hand-coded data representations, while simultaneously encoding spatial dependency in our topics without adding model complexity. We apply this model to the motivating application of high-level scene understanding and mission summarization for exploratory marine robots. Our experiments on a seafloor dataset collected by a marine robot show that the proposed hybrid framework outperforms current state-of-the-art approaches on the task of unsupervised seafloor terrain characterization.
RONov 29, 2017
Near-optimal irrevocable sample selection for periodic data streams with applications to marine roboticsGenevieve Flaspohler, Nicholas Roy, Yogesh Girdhar
We consider the task of monitoring spatiotemporal phenomena in real-time by deploying limited sampling resources at locations of interest irrevocably and without knowledge of future observations. This task can be modeled as an instance of the classical secretary problem. Although this problem has been studied extensively in theoretical domains, existing algorithms require that data arrive in random order to provide performance guarantees. These algorithms will perform arbitrarily poorly on data streams such as those encountered in robotics and environmental monitoring domains, which tend to have spatiotemporal structure. We focus on the problem of selecting representative samples from phenomena with periodic structure and introduce a novel sample selection algorithm that recovers a near-optimal sample set according to any monotone submodular utility function. We evaluate our algorithm on a seven-year environmental dataset collected at the Martha's Vineyard Coastal Observatory and show that it selects phytoplankton sample locations that are nearly optimal in an information-theoretic sense for predicting phytoplankton concentrations in locations that were not directly sampled. The proposed periodic secretary algorithm can be used with theoretical performance guarantees in many real-time sensing and robotics applications for streaming, irrevocable sample selection from periodic data streams.