Amazon SageMaker Model Monitor: A System for Real-Time Insights into Deployed Machine Learning Models
This work addresses the need for reliable model monitoring in high-stakes industry settings, though its contribution is incremental because it builds on existing monitoring concepts within a specific platform.
The paper addresses the problem of maintaining machine learning model performance after deployment. It introduces Amazon SageMaker Model Monitor, a fully managed service that automatically detects data, concept, bias, and feature attribution drift in real time, and it reports quantitative evaluations and insights from more than two years of production deployment.
With the increasing adoption of machine learning (ML) models and systems in high-stakes settings across different industries, guaranteeing a model's performance after deployment has become crucial. Monitoring models in production is a critical aspect of ensuring their continued performance and reliability. We present Amazon SageMaker Model Monitor, a fully managed service that continuously monitors the quality of machine learning models hosted on Amazon SageMaker. Our system automatically detects data, concept, bias, and feature attribution drift in models in real time and provides alerts so that model owners can take corrective actions and thereby maintain high-quality models. We describe the key requirements obtained from customers, the system design and architecture, and the methodology for detecting different types of drift. Further, we provide quantitative evaluations followed by use cases, insights, and lessons learned from more than two years of production deployment.
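To make the notion of data drift concrete, the following is a minimal sketch of one common way to compare a baseline feature sample against live traffic: the two-sample Kolmogorov-Smirnov statistic. This is an illustration only and is not claimed to be the method SageMaker Model Monitor uses. The function names `ks_statistic` and `has_drifted` and the threshold of 0.2 are hypothetical choices made for the example.

```python
from bisect import bisect_right


def ks_statistic(baseline, live):
    """Return the two-sample Kolmogorov-Smirnov statistic.

    This is the largest absolute gap between the empirical CDFs
    of the two samples. 0.0 means identical empirical distributions,
    1.0 means the samples do not overlap at all.
    """
    if not baseline or not live:
        raise ValueError("both samples must be non-empty")
    b = sorted(baseline)
    l = sorted(live)
    max_gap = 0.0
    # The empirical CDFs only change at observed values, so it is
    # enough to evaluate the gap at every point of either sample.
    for x in b + l:
        cdf_b = bisect_right(b, x) / len(b)
        cdf_l = bisect_right(l, x) / len(l)
        max_gap = max(max_gap, abs(cdf_b - cdf_l))
    return max_gap


def has_drifted(baseline, live, threshold=0.2):
    """Flag drift when the KS statistic exceeds a threshold.

    The default threshold is an arbitrary illustrative value,
    not a recommendation from the paper.
    """
    return ks_statistic(baseline, live) > threshold
```

In practice a monitor of this kind would compute the statistic per feature over a sliding window of captured inference requests and raise an alert when the flag is set; the right threshold depends on sample size and on how much variation the model tolerates.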