Monitoring and explainability of models in production
This work addresses the problem of keeping ML services reliable in production, which matters to practitioners, but its contribution is incremental because it reviews existing techniques and tools.
The paper addresses the challenge of monitoring and explaining machine learning models after deployment to maintain service quality, covering performance tracking, data drift detection, and prediction explanations with examples of open-source tools.
The machine learning lifecycle extends beyond the deployment stage. Monitoring deployed models is crucial for the continued provision of high-quality machine-learning-enabled services. Key areas include monitoring model performance and data, detecting outliers and data drift using statistical techniques, and providing explanations of historical predictions. We discuss the challenges of successfully implementing solutions in each of these areas and give recent examples of production-ready solutions built with open-source tools.
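As an illustration of drift detection with a statistical technique, the sketch below compares a reference sample of one feature against a production sample using a two-sample Kolmogorov-Smirnov test. The abstract does not name a specific test, so the choice of test, the function names (ks_statistic, ks_p_value, detect_drift), the asymptotic p-value approximation, and the 0.05 threshold are all assumptions made for this example rather than the authors' method.

```python
import math


def ks_statistic(reference, production):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    if not reference or not production:
        raise ValueError("both samples must be non-empty")
    ref, prod = sorted(reference), sorted(production)
    n, m = len(ref), len(prod)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(ref[i], prod[j])
        # Step past every value equal to x in both samples (handles ties).
        while i < n and ref[i] == x:
            i += 1
        while j < m and prod[j] == x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_p_value(d, n, m, terms=100):
    """Asymptotic p-value from the Kolmogorov distribution (an approximation)."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        # The series converges slowly here; the true value is effectively 1.
        return 1.0
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
    return min(1.0, max(0.0, 2.0 * total))


def detect_drift(reference, production, alpha=0.05):
    """Flag drift when the p-value falls below alpha (assumed threshold)."""
    d = ks_statistic(reference, production)
    p = ks_p_value(d, len(reference), len(production))
    return p < alpha, d, p


# Hypothetical feature values: training-time reference vs. shifted production data.
reference_sample = [float(x) for x in range(100)]
production_sample = [float(x + 50) for x in range(100)]
drifted, statistic, p_value = detect_drift(reference_sample, production_sample)
print(f"drift={drifted} statistic={statistic:.3f} p_value={p_value:.3g}")
```

A univariate test like this is applied to one feature at a time. Monitoring many features at once would need a correction for multiple comparisons or a multivariate method, which this sketch does not cover.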