Temporal Detection of Anomalies via Actor-Critic Based Controlled Sensing
This work addresses anomaly detection in monitoring systems, but its contribution is incremental: it applies existing actor-critic methods to a controlled sensing problem.
The paper tackles the problem of detecting when the number of anomalies among a set of binary stochastic processes exceeds a threshold. It combines a Bayesian formulation with reinforcement learning for sequential decision-making and, in numerical experiments, reports better performance than traditional model-based algorithms.
We address the problem of monitoring a set of binary stochastic processes and generating an alert when the number of anomalies among them exceeds a threshold. For this, the decision-maker selects and probes a subset of the processes to obtain noisy estimates of their states (normal or anomalous). Based on the received observations, the decision-maker first determines whether to declare that the number of anomalies has exceeded the threshold or to continue taking observations. When the decision is to continue, it then decides whether to collect observations at the next time instant or to defer collection to a later time. If it chooses to collect observations, it further determines the subset of processes to be probed. To devise this three-step sequential decision-making process, we use a Bayesian formulation wherein we learn the posterior probability of the states of the processes. Using the posterior probability, we construct a Markov decision process and solve it using deep actor-critic reinforcement learning. Via numerical experiments, we demonstrate the superior performance of our algorithm compared to traditional model-based algorithms.
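The Bayesian step described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes, purely for illustration, that the processes are independent, that their states do not change between probes, and that each probe returns the true state flipped with a known probability `flip_prob`. The function names and parameters are hypothetical. The first function updates the per-process anomaly probabilities after a probe; the second computes the probability that the anomaly count exceeds the threshold, a quantity a stopping rule could compare against a confidence level. In the paper, the decisions to stop, defer, and select processes come from a learned actor-critic policy, which this sketch does not attempt to reproduce.

```python
import numpy as np


def update_posterior(belief, probed, obs, flip_prob):
    """Bayes update of per-process anomaly probabilities after probing a subset.

    Assumes (hypothetically) independent processes and symmetric observation
    noise: each probe reports the true state flipped with probability flip_prob.
    """
    belief = np.asarray(belief, dtype=float).copy()
    for k, y in zip(probed, obs):
        # Likelihood of observation y if process k is anomalous (1) or normal (0).
        like_anom = 1.0 - flip_prob if y == 1 else flip_prob
        like_norm = flip_prob if y == 1 else 1.0 - flip_prob
        num = like_anom * belief[k]
        belief[k] = num / (num + like_norm * (1.0 - belief[k]))
    return belief


def prob_count_exceeds(belief, threshold):
    """P(number of anomalies > threshold) under independent marginals.

    Builds the distribution of the anomaly count (a Poisson binomial) by
    convolving one Bernoulli distribution per process.
    """
    dist = np.array([1.0])  # dist[i] = P(count == i); starts with zero processes
    for p in belief:
        dist = np.convolve(dist, [1.0 - p, p])
    return float(dist[threshold + 1:].sum())


if __name__ == "__main__":
    prior = [0.5, 0.5, 0.5]
    # Probe process 0 and observe "anomalous" with a 20% flip probability.
    posterior = update_posterior(prior, probed=[0], obs=[1], flip_prob=0.2)
    print(posterior)
    print(prob_count_exceeds(posterior, threshold=1))
```

Processes that are not probed keep their prior belief here. A model with state dynamics would also propagate every belief forward in time before each update.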