LG · AI · Sep 21, 2021

A Distance-based Anomaly Detection Framework for Deep Reinforcement Learning

arXiv:2109.09889v3 · 9 citations
Originality: Incremental advance
AI Analysis

This work addresses safety and reliability issues in deploying deep reinforcement learning systems in real-world applications. It is an incremental advance that extends existing distance-based methods into a unified framework.

The paper tackles abnormal states in deep reinforcement learning systems, which can cause unpredictable behavior and unsafe actions. It proposes MDX, a Mahalanobis distance-based anomaly detection framework that handles random, adversarial, and out-of-distribution state outliers in both offline and online settings, and it evaluates the framework on classical control environments, Atari games, and autonomous driving scenarios.

In deep reinforcement learning (RL) systems, abnormal states pose significant risks by potentially triggering unpredictable behaviors and unsafe actions, thus impeding the deployment of RL systems in real-world scenarios. It is crucial for reliable decision-making systems to have the capability to cast an alert whenever they encounter unfamiliar observations that they are not equipped to handle. In this paper, we propose a novel Mahalanobis distance-based (MD) anomaly detection framework, called MDX, for deep RL algorithms. MDX simultaneously addresses random, adversarial, and out-of-distribution (OOD) state outliers in both offline and online settings. It utilizes Mahalanobis distance within class-conditional distributions for each action and operates within a statistical hypothesis testing framework under the Gaussian assumption. We further extend it to robust and distribution-free versions by incorporating Robust MD and conformal inference techniques. Through extensive experiments on classical control environments, Atari games, and autonomous driving scenarios, we demonstrate the effectiveness of our MD-based detection framework. MDX offers a simple, unified, and practical anomaly detection tool for enhancing the safety and reliability of RL systems in real-world applications.
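The abstract's core idea, scoring a state by its Mahalanobis distance to a per-action Gaussian and flagging it when the score exceeds a threshold, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the ridge term, the use of per-action (rather than shared) covariances, the synthetic features, and the choice of the policy's action as the class label are all assumptions made for illustration. The threshold uses a split-conformal quantile from held-out nominal data as one possible distribution-free variant; under the Gaussian assumption, a chi-square quantile with as many degrees of freedom as feature dimensions would play the same role.

```python
import numpy as np

def fit_action_gaussians(feats, actions, n_actions, ridge=1e-6):
    """Estimate a mean and precision matrix per action (hypothetical helper).

    Assumes feats has shape (n, d) with d >= 2 and that every action
    appears at least twice in `actions`.
    """
    d = feats.shape[1]
    params = {}
    for a in range(n_actions):
        x = feats[actions == a]
        mean = x.mean(axis=0)
        # Small ridge keeps the covariance invertible (an assumption, not from the paper).
        cov = np.cov(x, rowvar=False) + ridge * np.eye(d)
        params[a] = (mean, np.linalg.inv(cov))
    return params

def squared_md(feat, action, params):
    """Squared Mahalanobis distance of one feature vector to its action's Gaussian."""
    mean, precision = params[int(action)]
    diff = feat - mean
    return float(diff @ precision @ diff)

def conformal_threshold(cal_scores, alpha=0.05):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest calibration score."""
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: nothing can be flagged.
        return float("inf")
    return float(np.sort(cal_scores)[k - 1])

# Synthetic stand-in for policy features: two actions, 4-dimensional features.
rng = np.random.default_rng(0)
n_actions, d = 2, 4
centers = np.array([[0.0] * d, [3.0] * d])

train_actions = rng.integers(0, n_actions, size=2000)
train_feats = centers[train_actions] + rng.normal(size=(2000, d))
params = fit_action_gaussians(train_feats, train_actions, n_actions)

# Held-out nominal data calibrates the detection threshold.
cal_actions = rng.integers(0, n_actions, size=500)
cal_feats = centers[cal_actions] + rng.normal(size=(500, d))
cal_scores = np.array(
    [squared_md(f, a, params) for f, a in zip(cal_feats, cal_actions)]
)
tau = conformal_threshold(cal_scores, alpha=0.05)

# A state far from the action-0 cluster should exceed the threshold.
outlier = np.full(d, 10.0)
is_anomaly = squared_md(outlier, 0, params) > tau
```

In an RL setting, `feats` would plausibly come from a hidden layer of the policy or value network and `actions` from the policy's own choices on nominal rollouts; both mappings are assumptions here rather than details stated in the abstract.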

