Diagnostic Runtime Monitoring with Martingales
This work addresses the need for robust monitoring in safety-critical robotics: identifying the cause of a distribution shift allows targeted interventions that prevent system failure.
The paper diagnoses distribution shifts in safety-critical robotics by deploying multiple stochastic martingales in a streaming fashion. The authors report improved speed, accuracy, and flexibility compared to existing methods, validated in both simulated and live hardware settings.
Machine learning systems deployed in safety-critical robotics settings must be robust to distribution shifts. However, system designers must understand the cause of a distribution shift in order to implement the appropriate intervention or mitigation strategy and prevent system failure. In this paper, we present a novel framework for diagnosing distribution shifts in a streaming fashion by deploying multiple stochastic martingales simultaneously. We show that knowledge of the underlying cause of a distribution shift can lead to proper interventions over the lifecycle of a deployed system. Our experimental framework can easily be adapted to different types of distribution shifts, models, and datasets. We find that our method outperforms existing work on diagnosing distribution shifts in terms of speed, accuracy, and flexibility, and validate the efficiency of our model in both simulated and live hardware settings.
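To make the idea of running several martingales over a stream concrete, the following is a minimal sketch and not the authors' implementation. It assumes a standard conformal test martingale with a power betting function, one martingale per hypothesised cause, and a diagnosis rule that reports whichever martingale first crosses the threshold 1/alpha (by Ville's inequality this happens with probability at most alpha when no shift occurs). The names `PowerMartingale` and `diagnose`, the `brightness` and `speed` features, and all numeric settings are hypothetical and chosen only for illustration.

```python
import math
import random


class PowerMartingale:
    """Conformal test martingale with betting function eps * p**(eps - 1).

    Illustrative assumption: this is a textbook construction, not
    necessarily the martingale used in the paper.
    """

    def __init__(self, epsilon=0.5, seed=0):
        self.epsilon = epsilon
        self.scores = []        # nonconformity scores seen so far
        self.log_value = 0.0    # log of the martingale value (starts at log 1)
        self.rng = random.Random(seed)

    def update(self, score):
        # Smoothed conformal p-value: how unusual is the new score
        # compared with every score seen so far (ties broken at random)?
        greater = sum(s > score for s in self.scores)
        equal = 1 + sum(s == score for s in self.scores)  # counts the new score
        p = (greater + self.rng.random() * equal) / (len(self.scores) + 1)
        p = max(p, 1e-12)  # guard against log(0)
        self.scores.append(score)
        # Multiply the martingale by the bet; small p-values make it grow.
        self.log_value += math.log(self.epsilon) + (self.epsilon - 1.0) * math.log(p)
        return self.log_value


def diagnose(stream, score_fns, alpha=0.001):
    """Run one martingale per hypothesised cause over the stream.

    Returns (cause, step) for the first martingale whose value reaches
    1 / alpha, or (None, None) if none does.
    """
    monitors = {name: PowerMartingale(seed=i) for i, name in enumerate(score_fns)}
    log_threshold = math.log(1.0 / alpha)
    for t, x in enumerate(stream):
        for name, score_fn in score_fns.items():
            if monitors[name].update(score_fn(x)) >= log_threshold:
                return name, t
    return None, None


# Synthetic stream (hypothetical): brightness shifts at step 100, speed never shifts.
data_rng = random.Random(42)
stream = []
for t in range(300):
    brightness_mean = 0.5 if t < 100 else 0.9
    stream.append({
        "brightness": data_rng.gauss(brightness_mean, 0.05),
        "speed": data_rng.gauss(1.0, 0.1),
    })

# One nonconformity score per hypothesised cause: distance from the nominal value.
score_fns = {
    "brightness": lambda x: abs(x["brightness"] - 0.5),
    "speed": lambda x: abs(x["speed"] - 1.0),
}

cause, step = diagnose(stream, score_fns, alpha=0.001)
print(cause, step)
```

In this sketch the martingale tied to the shifted feature grows once post-shift observations look unusual relative to the history, while the other is expected to stay below the threshold, so the identity of the alarming martingale serves as the diagnosis. Working in log space avoids numerical underflow, since the power martingale tends to shrink while the data remain exchangeable.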