Assessment of the influence of features on a classification problem: an application to COVID-19 patients
This paper provides a theoretically grounded feature importance measure for classification problems and applies it to COVID-19 data, though the contribution appears incremental, as it adapts existing Shapley value concepts to this context.
The paper tackles the problem of evaluating feature importance in classification problems by introducing a measure based on Shapley values from cooperative game theory, with an axiomatic characterization and experimental validation. It applies this methodology to COVID-19 patient data to study the influence of demographic and risk factors on disease evolution.
This paper deals with an important subject in classification problems addressed by machine learning techniques: evaluating the influence of each feature on the classification of individuals. Specifically, a measure of that influence is introduced using the Shapley value of cooperative games. In addition, an axiomatic characterisation of the proposed measure is provided, based on the properties of efficiency and balanced contributions. Furthermore, some experiments have been designed to validate that the measure performs appropriately. Finally, the methodology is applied to a sample of COVID-19 patients to study the influence of certain demographic and risk factors on various events of interest related to the evolution of the disease.
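To make the two axioms named above concrete, the following is a minimal sketch of an exact Shapley value computation on a toy cooperative game. It is not the paper's measure: the feature names (`age`, `sex`) and the coalition worths in `toy_worth` are invented for illustration, and the paper's actual characteristic function is not reproduced here. The sketch only shows how efficiency (the values sum to the worth of the full feature set) and balanced contributions (each feature gains the same amount from the presence of the other) can be checked numerically.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, v):
    """Exact Shapley values for a game with characteristic function v.

    players: list of hashable player (feature) labels.
    v: callable mapping a frozenset of players to a real worth.
    """
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                coalition = frozenset(subset)
                # Marginal contribution of i when joining the coalition.
                total += weight * (v(coalition | {i}) - v(coalition))
        phi[i] = total
    return phi


# Hypothetical worths: how much of the classification each feature
# subset is assumed to explain. These numbers are made up.
toy_worth = {
    frozenset(): 0.0,
    frozenset({"age"}): 0.2,
    frozenset({"sex"}): 0.1,
    frozenset({"age", "sex"}): 0.5,
}


def v(coalition):
    return toy_worth[frozenset(coalition)]


phi = shapley_values(["age", "sex"], v)

# Efficiency: the values add up to the worth of the grand coalition.
efficiency_gap = sum(phi.values()) - v({"age", "sex"})

# Balanced contributions: in a one-player game the Shapley value is the
# player's own worth, so compare what each feature gains from the other.
gain_age = phi["age"] - v({"age"})
gain_sex = phi["sex"] - v({"sex"})
```

The enumeration visits every coalition, so its cost grows exponentially with the number of features; it is suitable only for small feature sets such as the handful of factors in this toy example.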