LG · ML · Oct 2, 2019

Benchmarking machine learning models on multi-centre eICU critical care dataset

arXiv:1910.00964v3 · 97 citations
Originality: Synthesis-oriented
AI Analysis

This paper provides a standardized evaluation framework for critical care researchers. The contribution is incremental, since it adapts existing benchmarking practices to a new dataset rather than introducing new methods.

The authors address the lack of public benchmarks in critical care by proposing a benchmark suite of four tasks (mortality prediction, length of stay estimation, patient phenotyping, and risk of decompensation) on the eICU dataset of around 73,000 patients, and they compare clinical models against machine learning models on each task.

Progress of machine learning in critical care has been difficult to track, in part due to the absence of public benchmarks. Other fields of research (such as computer vision and natural language processing) have established various competitions and public benchmarks. The recent availability of large clinical datasets has made it possible to establish public benchmarks in critical care as well. Taking advantage of this opportunity, we propose a public benchmark suite to address four areas of critical care, namely mortality prediction, estimation of length of stay, patient phenotyping and risk of decompensation. We define each task and compare the performance of clinical models with that of baseline and deep learning models using the eICU critical care dataset of around 73,000 patients. This is the first public benchmark on a multi-centre critical care dataset, comparing the performance of the clinical gold standard with our predictive model. We also investigate the impact of numerical variables, as well as the handling of categorical variables, on each of the defined tasks. The source code detailing our methods and experiments is publicly available, so that anyone can replicate our results and build upon our work.

Code Implementations: 2 repos