Lasso based feature selection for malaria risk exposure prediction
This work addresses the need for automated decision support in life sciences, potentially aiding or replacing expert knowledge in malaria risk prediction, though it appears incremental as it builds on existing Lasso and GLM methods.
The paper tackled the problem of automating variable selection for malaria risk exposure prediction by proposing a novel approach that uses Lasso with double cross-validation to select optimal subsets and GLM for predictions, resulting in stable and consistent estimators.
In the life sciences, experts generally rely on empirical knowledge to recode variables, choose interactions, and perform selection with classical approaches. The aim of this work is to develop an automatic learning algorithm for variable selection, in order to determine whether experts can be helped in their decisions, or simply replaced by the machine, and whether their knowledge and results can be improved. Under some conditions, the Lasso method can detect the optimal subset of variables for estimation and prediction. In this paper, we propose a novel approach that automatically uses all available variables and all their interactions. Through a double cross-validation combined with the Lasso, we select the best subset of variables, and with a GLM we perform predictions through a simple cross-validation. The algorithm ensures the stability and the consistency of the estimators.
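The selection stage described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a Gaussian response, pairwise products as the interactions, a small hypothetical penalty grid, and synthetic data. The helper names (`add_interactions`, `lasso_fit`, `choose_lambda`, `selection_frequency`) are invented for illustration. An inner cross-validation picks the Lasso penalty, and an outer loop records how often each variable is kept. The final GLM prediction step is not shown.

```python
import random
from itertools import combinations

def add_interactions(X, names):
    """Append every pairwise product x_i * x_j to the main effects."""
    pairs = list(combinations(range(len(names)), 2))
    rows = [row + [row[i] * row[j] for i, j in pairs] for row in X]
    return rows, names + [f"{names[i]}:{names[j]}" for i, j in pairs]

def lasso_fit(X, y, lam, n_iter=50):
    """Lasso by cyclic coordinate descent; returns (intercept, coefficients)."""
    n, p = len(X), len(X[0])
    means = [sum(r[j] for r in X) / n for j in range(p)]
    cols = [[r[j] - means[j] for r in X] for j in range(p)]  # centered columns
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]
    scale = [sum(v * v for v in c) / n for c in cols]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j, c in enumerate(cols):
            if scale[j] == 0.0:
                continue
            rho = sum(cv * rv for cv, rv in zip(c, resid)) / n + scale[j] * beta[j]
            # Soft-thresholding: small correlations are set exactly to zero.
            new = max(abs(rho) - lam, 0.0) / scale[j] * (1 if rho > 0 else -1)
            if new != beta[j]:
                d = new - beta[j]
                resid = [rv - cv * d for cv, rv in zip(c, resid)]
                beta[j] = new
    return y_mean - sum(m * b for m, b in zip(means, beta)), beta

def mse(model, X, y):
    b0, beta = model
    preds = [b0 + sum(b * v for b, v in zip(beta, r)) for r in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)

def folds(n, k, rng):
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def split(X, y, held):
    held_set = set(held)
    train = [i for i in range(len(y)) if i not in held_set]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in held], [y[i] for i in held])

def choose_lambda(X, y, lambdas, k, rng):
    """Inner cross-validation: the penalty with the lowest mean held-out error."""
    parts = folds(len(y), k, rng)
    def cv_error(lam):
        total = 0.0
        for held in parts:
            Xt, yt, Xv, yv = split(X, y, held)
            total += mse(lasso_fit(Xt, yt, lam), Xv, yv)
        return total / k
    return min(lambdas, key=cv_error)

def selection_frequency(X, y, lambdas, outer_k=4, inner_k=3, seed=0):
    """Outer loop: fraction of outer folds in which each variable is kept."""
    rng = random.Random(seed)
    counts = [0] * len(X[0])
    for held in folds(len(y), outer_k, rng):
        Xt, yt, _, _ = split(X, y, held)
        lam = choose_lambda(Xt, yt, lambdas, inner_k, rng)
        _, beta = lasso_fit(Xt, yt, lam)
        counts = [c + (b != 0.0) for c, b in zip(counts, beta)]
    return [c / outer_k for c in counts]

# Synthetic data (an assumption): one main effect and one interaction matter.
rng = random.Random(1)
names = ["x0", "x1", "x2", "x3"]
raw = [[rng.gauss(0, 1) for _ in names] for _ in range(80)]
y = [2 * r[0] + 3 * r[1] * r[2] + rng.gauss(0, 0.1) for r in raw]
X, all_names = add_interactions(raw, names)
freq = dict(zip(all_names, selection_frequency(X, y, [0.05, 0.1, 0.2, 0.4])))
stable = [name for name, f in freq.items() if f == 1.0]
print(stable)
```

Keeping only the variables retained in every outer fold is one possible stability rule; a looser frequency threshold could be used instead. For a binary exposure outcome, a logistic Lasso and a logistic GLM refit on `stable` would be a closer match to the setting the abstract describes.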