MemStream: Memory-Based Streaming Anomaly Detection
MemStream addresses the need for data-efficient, online anomaly detection in real-world streaming scenarios where concept drift occurs; it is an incremental improvement over existing methods.
The authors tackle the problem of detecting anomalies in streaming data under concept drift by proposing MemStream, a framework that combines a denoising autoencoder with a memory module to adapt online without labels. They evaluate it against state-of-the-art streaming baselines on 2 synthetic and 11 real-world datasets.
Given a stream of entries over time in a multi-dimensional data setting where concept drift is present, how can we detect anomalous activities? Most of the existing unsupervised anomaly detection approaches seek to detect anomalous events in an offline fashion and require a large amount of data for training. This is not practical in real-life scenarios where we receive the data in a streaming manner and do not know the size of the stream beforehand. Thus, we need a data-efficient method that can detect and adapt to changing data trends, or concept drift, in an online manner. In this work, we propose MemStream, a streaming anomaly detection framework, allowing us to detect unusual events as they occur while being resilient to concept drift. We leverage the power of a denoising autoencoder to learn representations and a memory module to learn the dynamically changing trend in data without the need for labels. We prove the optimum memory size required for effective drift handling. Furthermore, MemStream makes use of two architecture design choices to be robust to memory poisoning. Experimental results show the effectiveness of our approach compared to state-of-the-art streaming baselines using $2$ synthetic datasets and $11$ real-world datasets.
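The idea of scoring each record against a memory of recent normal behaviour, and refreshing that memory only with records that look normal, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring rule, the memory update policy, or the two design choices that guard against memory poisoning. The sketch assumes a nearest-neighbour L1 distance as the anomaly score, a FIFO memory, and a fixed update threshold, and it substitutes an identity function for the denoising autoencoder. The names `MemoryScorer`, `memory_size`, `update_threshold`, and `encode` are hypothetical.

```python
from collections import deque


class MemoryScorer:
    """Toy memory-based streaming anomaly scorer (illustrative assumptions only)."""

    def __init__(self, memory_size, update_threshold, encode=None):
        # Assumed FIFO memory: once full, appending evicts the oldest entry.
        self.memory = deque(maxlen=memory_size)
        # Assumed gate: only records scoring at or below this value enter memory.
        self.update_threshold = update_threshold
        # Stand-in for a learned encoder; the identity map is used by default.
        self.encode = encode if encode is not None else (lambda x: list(x))

    def warm_up(self, records):
        """Seed the memory with records assumed to be normal."""
        for record in records:
            self.memory.append(self.encode(record))

    def score(self, record):
        """Return the L1 distance from the record to its nearest memory entry."""
        if not self.memory:
            return float("inf")  # nothing to compare against yet
        z = self.encode(record)
        return min(sum(abs(a - b) for a, b in zip(z, m)) for m in self.memory)

    def process(self, record):
        """Score a record, then update memory only if it looks normal."""
        s = self.score(record)
        if s <= self.update_threshold:
            self.memory.append(self.encode(record))
        return s


# Usage: a small stream with one obvious outlier and a slowly drifting trend.
scorer = MemoryScorer(memory_size=4, update_threshold=1.0)
scorer.warm_up([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]])
stream = [[0.2, 0.1], [5.0, 5.0], [0.3, 0.2], [0.4, 0.3]]
scores = [scorer.process(x) for x in stream]
```

In this toy setup the outlier `[5.0, 5.0]` receives a much larger score than the other records and is kept out of the memory, while the gradually shifting records are absorbed, so the memory follows the drift. The threshold trades off adaptation speed against the risk of admitting anomalies into memory.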