LG ML · Oct 2, 2019

Benchmarking machine learning models on multi-centre eICU critical care dataset

Seyedmostafa Sheikhalishahi, Vevake Balaraman, Venet Osmani

arXiv:1910.00964v316.597 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This paper provides a standardized evaluation framework for critical care researchers, though the contribution is incremental: it adapts existing benchmarking practices to a new domain.

The authors tackled the lack of public benchmarks in critical care by proposing a benchmark suite for four tasks (mortality prediction, length of stay estimation, patient phenotyping, and risk of decompensation) using the eICU dataset of around 73,000 patients, comparing clinical and machine learning models.

Progress of machine learning in critical care has been difficult to track, in part due to the absence of public benchmarks. Other fields of research (such as computer vision and natural language processing) have established various competitions and public benchmarks. The recent availability of large clinical datasets has made it possible to establish public benchmarks in critical care as well. Taking advantage of this opportunity, we propose a public benchmark suite to address four areas of critical care, namely mortality prediction, estimation of length of stay, patient phenotyping and risk of decompensation. We define each task and compare the performance of clinical models with that of baseline and deep learning models, using the eICU critical care dataset of around 73,000 patients. This is the first public benchmark on a multi-centre critical care dataset, comparing the performance of the clinical gold standard with our predictive model. We also investigate the impact of numerical variables, as well as the handling of categorical variables, on each of the defined tasks. The source code, detailing our methods and experiments, is publicly available so that anyone can replicate our results and build upon our work.
