MPP: Model Performance Predictor
This addresses a key operational challenge for teams deploying ML pipelines in production, though it is incremental as it builds on existing monitoring concepts.
The paper tackles the problem of monitoring machine learning model performance in production without ground truth labels by proposing the Model Performance Predictor (MPP), an ML algorithm that uses an ensemble of metrics to create a score for prediction quality, enabling automated alerts and scalable deployments.
Operations is a key challenge in the domain of machine learning pipeline deployments involving monitoring and management of real-time prediction quality. Typically, metrics like accuracy, RMSE etc., are used to track the performance of models in deployment. However, these metrics cannot be calculated in production due to the absence of labels. We propose using an ML algorithm, Model Performance Predictor (MPP), to track the performance of the models in deployment. We argue that an ensemble of such metrics can be used to create a score representing the prediction quality in production. This in turn facilitates formulation and customization of ML alerts, that can be escalated by an operations team to the data science team. Such a score automates monitoring and enables ML deployments at scale.