AI, LG | Nov 5, 2024

Evaluating Machine Learning Models against Clinical Protocols for Enhanced Interpretability and Continuity of Care

Christel Sirocchi, Muhammad Suffian, Federico Sabbatini, Alessandro Bogliolo, Sara Montagna

arXiv:2411.03105v12.3 | h-index: 5 | Has Code | EXPLIMED@ECAI

Originality: Incremental advance

AI Analysis

This work addresses interpretability and continuity of care issues for clinicians using ML in medical decision-making, but it is incremental as it builds on existing integration approaches.

The paper tackles the limited adoption of machine learning models in clinical practice by proposing metrics that compare ML models with clinical protocols in terms of accuracy and explanation similarity. Validated on the Pima Indians Diabetes dataset, the approach shows that a model integrating the protocol performs comparably to a fully data-driven model, with superior accuracy relative to the clinical protocol and explanations that align more closely with it.

In clinical practice, decision-making relies heavily on established protocols, often formalised as rules. Concurrently, Machine Learning (ML) models, trained on clinical data, aspire to integrate into medical decision-making processes. However, despite the growing number of ML applications, their adoption into clinical practice remains limited. Two critical concerns arise, relevant to the notions of consistency and continuity of care: (a) accuracy - the ML model, albeit more accurate, might introduce errors that would not have occurred by applying the protocol; (b) interpretability - ML models operating as black boxes might make predictions based on relationships that contradict established clinical knowledge. In this context, the literature suggests using ML models integrating domain knowledge for improved accuracy and interpretability. However, there is a lack of appropriate metrics for comparing ML models with clinical rules in addressing these challenges. Accordingly, in this article, we first propose metrics to assess the accuracy of ML models with respect to the established protocol. Secondly, we propose an approach to measure the distance of explanations provided by two rule sets, with the goal of comparing the explanation similarity between clinical rule-based systems and rules extracted from ML models. The approach is validated on the Pima Indians Diabetes dataset by training two neural networks - one exclusively on data, and the other integrating a clinical protocol. Our findings demonstrate that the integrated ML model achieves comparable performance to that of a fully data-driven model while exhibiting superior accuracy relative to the clinical protocol, ensuring enhanced continuity of care. Furthermore, we show that our integrated model provides explanations for predictions that align more closely with the clinical protocol compared to the data-driven model.
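Concern (a) in the abstract, that a model may introduce errors the protocol would not have made, can be illustrated with a small sketch. The abstract does not give the paper's metric definitions, so the function below is not the authors' metric. The name `protocol_relative_errors` and the returned keys are hypothetical, chosen here only to show one plausible way to count errors a model introduces or corrects relative to a protocol.

```python
from typing import Dict, Sequence


def protocol_relative_errors(
    y_true: Sequence[int],
    y_protocol: Sequence[int],
    y_model: Sequence[int],
) -> Dict[str, float]:
    """Compare model predictions with protocol predictions, case by case.

    Illustrative only: these quantities are assumptions made for this
    sketch, not the metrics defined in the paper.
    """
    if not (len(y_true) == len(y_protocol) == len(y_model)):
        raise ValueError("all label sequences must have the same length")
    n = len(y_true)
    if n == 0:
        raise ValueError("at least one sample is required")

    protocol_correct = 0
    model_correct = 0
    introduced = 0  # protocol was right, model is wrong
    corrected = 0   # protocol was wrong, model is right

    for truth, proto, model in zip(y_true, y_protocol, y_model):
        proto_ok = proto == truth
        model_ok = model == truth
        protocol_correct += proto_ok
        model_correct += model_ok
        if proto_ok and not model_ok:
            introduced += 1
        elif model_ok and not proto_ok:
            corrected += 1

    return {
        "protocol_accuracy": protocol_correct / n,
        "model_accuracy": model_correct / n,
        "introduced_errors": introduced,
        "corrected_errors": corrected,
    }


# Toy labels, invented for illustration (not taken from the Pima dataset).
y_true = [1, 0, 1, 1, 0, 0]
y_protocol = [1, 0, 0, 1, 1, 0]
y_model = [1, 0, 1, 0, 0, 0]
report = protocol_relative_errors(y_true, y_protocol, y_model)
print(report)
```

In this toy case the model is more accurate overall than the protocol, yet it still misclassifies one case the protocol handled correctly. That is the continuity-of-care concern the abstract describes.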
