Near-optimal irrevocable sample selection for periodic data streams with applications to marine robotics
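This work addresses real-time monitoring with limited resources in domains such as robotics and environmental sensing, where existing secretary algorithms lose their performance guarantees because data arrive with spatiotemporal structure rather than in random order. It offers a sample selection method with theoretical guarantees for periodic data streams.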
The paper tackles irrevocable sample selection from periodic data streams, such as those encountered in marine robotics, by introducing a novel algorithm that recovers a near-optimal sample set according to any monotone submodular utility function. It demonstrates the algorithm on a seven-year environmental dataset collected at the Martha's Vineyard Coastal Observatory, selecting phytoplankton sample locations that are nearly optimal, in an information-theoretic sense, for predicting concentrations in unsampled locations.
We consider the task of monitoring spatiotemporal phenomena in real time by deploying limited sampling resources at locations of interest irrevocably and without knowledge of future observations. This task can be modeled as an instance of the classical secretary problem. Although this problem has been studied extensively in theoretical domains, existing algorithms require that data arrive in random order to provide performance guarantees. These algorithms can perform arbitrarily poorly on data streams such as those encountered in robotics and environmental monitoring domains, which tend to have spatiotemporal structure. We focus on the problem of selecting representative samples from phenomena with periodic structure and introduce a novel sample selection algorithm that recovers a near-optimal sample set according to any monotone submodular utility function. We evaluate our algorithm on a seven-year environmental dataset collected at the Martha's Vineyard Coastal Observatory and show that it selects phytoplankton sample locations that are nearly optimal in an information-theoretic sense for predicting phytoplankton concentrations in locations that were not directly sampled. The proposed periodic secretary algorithm provides theoretical performance guarantees for streaming, irrevocable sample selection from periodic data streams in many real-time sensing and robotics applications.
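To make the random-order assumption concrete, the following is a minimal sketch of the classical single-choice secretary rule (observe roughly the first n/e items, then irrevocably take the first item that beats everything seen so far). It is not the periodic secretary algorithm proposed here; it only illustrates why the classical rule depends on arrival order. The function names and the `decaying_periodic` stream are hypothetical, chosen for illustration.

```python
import math
import random


def classical_secretary(stream):
    """Index chosen by the classical 1/e stopping rule.

    Falls back to the last index if no later item beats the
    best value seen during the observation phase.
    """
    n = len(stream)
    k = int(n / math.e)  # length of the observation phase
    threshold = max(stream[:k]) if k > 0 else float("-inf")
    for i in range(k, n):
        if stream[i] > threshold:
            return i  # irrevocable selection
    return n - 1  # forced to accept the final item


def random_order(rng, n=100):
    """Distinct values arriving in uniformly random order."""
    values = list(range(n))
    rng.shuffle(values)
    return values


def decaying_periodic(rng, n=100, period=25):
    """Hypothetical structured stream: a periodic signal whose
    amplitude decays, so the global peak falls in the first cycle.
    The rng argument is unused; it keeps the signature uniform."""
    return [
        (1 - i / n) * (1 + math.sin(2 * math.pi * i / period))
        for i in range(n)
    ]


def success_rate(make_stream, trials=2000, seed=0):
    """Fraction of trials in which the rule selects the maximum."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        stream = make_stream(rng)
        if stream[classical_secretary(stream)] == max(stream):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("random order:     ", success_rate(random_order))
    print("decaying periodic:", success_rate(decaying_periodic))
```

On randomly ordered input the success rate should land near the well-known 1/e bound. On the structured stream the peak arrives during the observation phase, so the rule never selects it. This is the kind of failure that motivates an algorithm designed for periodic structure.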