Out-of-Distribution Detection for Medical Applications: Guidelines for Practical Evaluation
This work provides incremental guidelines that help healthcare practitioners mitigate the risks of deploying machine learning models by selecting better Out-of-Distribution (OOD) detection methods.
The paper addresses the lack of evaluation guidelines for selecting Out-of-Distribution (OOD) detection methods in medical applications, proposing practical considerations and tests illustrated on an Electronic Health Records use case to facilitate implementation in clinical practice.
Detection of Out-of-Distribution (OOD) samples in real time is a crucial safety check for deployment of machine learning models in the medical field. Despite a growing number of uncertainty quantification techniques, there is a lack of evaluation guidelines on how to select OOD detection methods in practice. This gap impedes implementation of OOD detection methods for real-world applications. Here, we propose a series of practical considerations and tests to choose the best OOD detector for a specific medical dataset. These guidelines are illustrated on a real-life use case of Electronic Health Records (EHR). Our results can serve as a guide for implementation of OOD detection methods in clinical practice, mitigating risks associated with the use of machine learning models in healthcare.
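To make the idea of comparing OOD detectors concrete, the following is a minimal sketch under stated assumptions, not the evaluation protocol of this paper. It scores two generic detectors (Mahalanobis distance and nearest-neighbor distance, both chosen here purely for illustration) on synthetic tabular data and compares them by AUROC, a common metric for this task. The data, the detectors, the shift used to create the OOD group, and all function names are hypothetical.

```python
import numpy as np


def auroc(scores_in, scores_out):
    """Probability that a random OOD sample gets a higher novelty score
    than a random in-distribution sample (ties count as one half)."""
    scores_in = np.asarray(scores_in, dtype=float)
    scores_out = np.asarray(scores_out, dtype=float)
    diff = scores_out[:, None] - scores_in[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())


def mahalanobis_scorer(train):
    """Illustrative detector: squared Mahalanobis distance to the training mean."""
    mean = train.mean(axis=0)
    # Small ridge term keeps the covariance matrix invertible.
    cov = np.cov(train, rowvar=False) + 1e-6 * np.eye(train.shape[1])
    precision = np.linalg.inv(cov)

    def score(x):
        d = x - mean
        return np.einsum("ij,jk,ik->i", d, precision, d)

    return score


def nearest_neighbor_scorer(train):
    """Illustrative detector: Euclidean distance to the closest training sample."""

    def score(x):
        dists = np.linalg.norm(x[:, None, :] - train[None, :, :], axis=2)
        return dists.min(axis=1)

    return score


# Synthetic stand-in for tabular features (assumption: NOT real EHR data).
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(500, 5))
test_in = rng.normal(0.0, 1.0, size=(200, 5))    # same distribution as training
test_out = rng.normal(3.0, 1.0, size=(200, 5))   # artificially shifted "OOD" group

detectors = {
    "mahalanobis": mahalanobis_scorer,
    "nearest_neighbor": nearest_neighbor_scorer,
}

# Fit each detector on training data, then compare them on the same test split.
results = {}
for name, make_scorer in detectors.items():
    score = make_scorer(train)
    results[name] = auroc(score(test_in), score(test_out))

for name, value in results.items():
    print(f"{name}: AUROC = {value:.3f}")
```

In a real setting the synthetic shifted group would be replaced by clinically meaningful OOD groups, and the detector with the most reliable separation across those groups would be preferred.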