LG SP ML · Jun 16, 2019

Addressing database variability in learning from medical data: an ensemble-based approach using convolutional neural networks and a case of study applied to automatic sleep scoring

Diego Alvarez-Estevez, Isaac Fernández-Varela

arXiv:1906.06666v33.421 citations

Originality: Incremental advance

AI Analysis

This paper addresses robust generalization across multiple databases in sleep medicine, though it appears incremental because it builds on existing CNN methods.

The paper tackles the database variability problem in medical machine learning, specifically for sleep staging, by proposing an ensemble of local models to improve inter-database generalization and data scalability, showing advantages in performance.

In this work we examine some of the problems associated with developing machine learning models that aim to achieve robust generalization in common-task, multiple-database scenarios. We refer to this as the "database variability problem" and focus on a specific medical domain (sleep staging in sleep medicine) to show that the local generalization capabilities estimated for a model do not translate trivially to independent external databases. We analyze some of the scalability problems that arise when data from multiple databases are used to train a single learning model. Then, we introduce a novel approach based on an ensemble of local models, and we show its advantages in terms of inter-database generalization performance and data scalability. In addition, we analyze different model configurations and data pre-processing techniques to determine their effects on overall generalization performance. For this purpose, we carry out experiments that involve several sleep databases and evaluate different machine learning models based on convolutional neural networks.
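
The ensemble-of-local-models idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the local models' outputs are combined, so majority voting is an assumption here, and the names `LocalModel`, `ensemble_predict`, and `make_threshold_model` are hypothetical. The threshold functions are toy stand-ins for CNNs that would each be trained on a single database.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Hypothetical type: a local model maps one epoch of signal features
# to a sleep-stage label. In the paper's setting each local model would
# be a CNN trained on one database; here it is just a callable.
LocalModel = Callable[[Sequence[float]], str]


def ensemble_predict(models: List[LocalModel], epoch: Sequence[float]) -> str:
    """Combine local models by majority vote (an assumed combination rule).

    Ties go to the label that was voted first, because Counter.most_common
    keeps first-encountered order among equal counts.
    """
    votes = [model(epoch) for model in models]
    return Counter(votes).most_common(1)[0][0]


def make_threshold_model(threshold: float) -> LocalModel:
    """Toy stand-in for a trained local model: thresholds the mean value."""
    def model(epoch: Sequence[float]) -> str:
        mean = sum(epoch) / len(epoch)
        return "W" if mean > threshold else "N2"
    return model


# Three toy "local models", one per hypothetical source database.
models = [make_threshold_model(t) for t in (0.2, 0.5, 0.8)]
prediction = ensemble_predict(models, [0.6, 0.6, 0.6])
print(prediction)
```

Under this scheme, adding a new database means training one more local model and appending it to the list, which avoids retraining a single model on the pooled data.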
