LG | SY | AP | Oct 17, 2023

Data Drift Monitoring for Log Anomaly Detection Pipelines

arXiv:2310.14893v1 | 1 citation | h-index: 18
Originality: Incremental advance
AI Analysis

The paper addresses the need for automated drift monitoring in LAD pipelines, which helps site reliability engineers maintain system reliability. The advance is incremental because it builds on existing drift detection concepts.

Log patterns change over time, which can degrade model performance in Log Anomaly Detection (LAD) pipelines. The paper introduces a Bayes Factor-based drift detection method that identifies when model retraining is needed, and demonstrates it on real and simulated log data.

Logs enable the monitoring of infrastructure status and the performance of associated applications. Logs are also invaluable for diagnosing the root causes of any problems that may arise. Log Anomaly Detection (LAD) pipelines automate the detection of anomalies in logs, providing assistance to site reliability engineers (SREs) in system diagnosis. Log patterns change over time, necessitating updates to the LAD model defining the 'normal' log activity profile. In this paper, we introduce a Bayes Factor-based drift detection method that identifies when intervention, retraining, and updating of the LAD model are required with human involvement. We illustrate our method using sequences of log activity, both from unaltered data and from simulated activity with controlled levels of anomaly contamination, based on real collected log data.
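The abstract does not spell out how the Bayes Factor is constructed, so the following is only a minimal sketch of one possible Bayes Factor drift check, not the authors' method. It assumes a Beta-Binomial model: each time window is summarized by the number of log sequences flagged as anomalous out of the total, and the test compares "both windows share one anomaly rate" against "each window has its own rate". The function names, the uniform Beta(1, 1) prior, and the decision threshold of 10 are all illustrative assumptions.

```python
from math import lgamma, log


def log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_bayes_factor(k_ref: int, n_ref: int, k_cur: int, n_cur: int,
                     a: float = 1.0, b: float = 1.0) -> float:
    """Log Bayes factor for 'different rates' (H1) over 'shared rate' (H0).

    k_* is the count of flagged sequences and n_* the total count in the
    reference and current windows. Both hypotheses use Beta(a, b) priors
    (an assumption of this sketch). The binomial coefficients appear in
    both marginal likelihoods and cancel, so they are omitted.
    """
    # H1: independent rates, one Beta-Binomial marginal per window.
    separate = (log_beta(a + k_ref, b + n_ref - k_ref)
                + log_beta(a + k_cur, b + n_cur - k_cur)
                - 2.0 * log_beta(a, b))
    # H0: a single rate generates the pooled counts.
    shared = (log_beta(a + k_ref + k_cur,
                       b + (n_ref - k_ref) + (n_cur - k_cur))
              - log_beta(a, b))
    return separate - shared


def drift_detected(k_ref: int, n_ref: int, k_cur: int, n_cur: int,
                   threshold: float = 10.0) -> bool:
    """Flag drift when the Bayes factor exceeds `threshold`.

    The default of 10 is a rule-of-thumb cutoff chosen for illustration.
    """
    return log_bayes_factor(k_ref, n_ref, k_cur, n_cur) > log(threshold)


if __name__ == "__main__":
    # Stable behaviour: 10 flagged out of 1000 in both windows.
    print(drift_detected(10, 1000, 10, 1000))
    # Shifted behaviour: the flagged share rises from 1% to 10%.
    print(drift_detected(10, 1000, 100, 1000))
```

In a pipeline along the lines the abstract describes, a positive result would prompt an SRE to review the data and decide whether to retrain the LAD model, rather than trigger retraining automatically. Working in log space keeps the computation stable for large window sizes.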

