LG · ML · Aug 12, 2019

Automatic Model Monitoring for Data Streams

arXiv:1908.04240v1 · 21 citations
AI Analysis

The paper addresses a specific issue in production systems such as fraud detection, but its contribution is incremental because it builds on existing concept drift detection methods.

The paper tackles two problems: detecting concept drift in data streams when labels are not yet available, and automatically generating explanations of the likely causes of the drift. It proposes SAMM, an automatic model monitoring system. In an evaluation on five real-world fraud detection datasets totaling more than 22 million transactions, SAMM detected anomalous events that domain experts considered useful.

Detecting concept drift is a well known problem that affects production systems. However, two important issues that are frequently not addressed in the literature are 1) the detection of drift when the labels are not immediately available; and 2) the automatic generation of explanations to identify possible causes for the drift. For example, a fraud detection model in online payments could show a drift due to a hot sale item (with an increase in false positives) or due to a true fraud attack (with an increase in false negatives) before labels are available. In this paper we propose SAMM, an automatic model monitoring system for data streams. SAMM detects concept drift using a time and space efficient unsupervised streaming algorithm and it generates alarm reports with a summary of the events and features that are important to explain it. SAMM was evaluated in five real world fraud detection datasets, each spanning periods up to eight months and totaling more than 22 million online transactions. We evaluated SAMM using human feedback from domain experts, by sending them 100 reports generated by the system. Our results show that SAMM is able to detect anomalous events in a model life cycle that are considered useful by the domain experts. Given these results, SAMM will be rolled out in a next version of Feedzai's Fraud Detection solution.
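
The abstract does not spell out SAMM's algorithm, so the following is only a minimal sketch of the general idea of label-free drift detection on a stream, not the authors' method. It assumes the model emits scores in [0, 1], keeps a fixed reference histogram of scores, maintains a sliding-window histogram incrementally, and raises an alarm when the Jensen-Shannon divergence between the two exceeds a threshold. The class name `ScoreDriftMonitor`, the bin count, the window size, and the threshold are all hypothetical choices made for illustration.

```python
import math
from collections import deque


def bin_index(score, bins):
    """Map a score in [0, 1] to a histogram bin, clamping out-of-range values."""
    return max(0, min(int(score * bins), bins - 1))


def normalize(counts):
    """Turn raw bin counts into probabilities, with add-one smoothing."""
    total = sum(counts) + len(counts)
    return [(c + 1) / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so the result lies in [0, 1])."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


class ScoreDriftMonitor:
    """Hypothetical unsupervised monitor: compares recent scores to a reference."""

    def __init__(self, reference_scores, window_size=500, threshold=0.1, bins=10):
        self.bins = bins
        self.window_size = window_size
        self.threshold = threshold
        ref_counts = [0] * bins
        for s in reference_scores:
            ref_counts[bin_index(s, bins)] += 1
        self.reference = normalize(ref_counts)
        self.window = deque()       # bin indices of the most recent scores
        self.counts = [0] * bins    # histogram of the current window

    def update(self, score):
        """Consume one score; return True if the window has drifted."""
        b = bin_index(score, self.bins)
        self.window.append(b)
        self.counts[b] += 1
        if len(self.window) > self.window_size:
            # Evict the oldest score so the histogram tracks the window.
            self.counts[self.window.popleft()] -= 1
        if len(self.window) < self.window_size:
            return False  # still warming up
        divergence = js_divergence(self.reference, normalize(self.counts))
        return divergence > self.threshold
```

No labels are needed at any point: the monitor only looks at the distribution of model scores, which is why a signal can appear before fraud labels arrive. Explaining the alarm, the second problem the abstract raises, is outside the scope of this sketch.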

