LG · CR · AP · CO · Mar 15, 2022

Practical data monitoring in the internet-services domain

arXiv:2203.08067v2 · h-index: 3
AI Analysis

The paper addresses the challenge of reducing false alarms for internet-service companies, though it appears incremental because it builds on existing anomaly detection methods.

The paper tackles the problem of high false alarm rates in large-scale metric monitoring for internet services, presenting a framework that significantly improves accuracy and enables interpretable models.

Large-scale monitoring, anomaly detection, and root cause analysis of metrics are essential requirements of the internet-services industry. To address the need to continuously monitor millions of metrics, many anomaly detection approaches are being used on a daily basis by large internet-based companies. However, in spite of the significant progress made to accurately and efficiently detect anomalies in metrics, the sheer scale of the number of metrics has meant there are still a large number of false alarms that need to be investigated. This paper presents a framework for reliable large-scale anomaly detection. It is significantly more accurate than existing approaches and allows for easy interpretation of models, thus enabling practical data monitoring in the internet-services domain.
