LG, Apr 8

Mining Electronic Health Records to Investigate Effectiveness of Ensemble Deep Clustering

arXiv:2604.07085
AI Analysis

This work addresses the challenge of improving clustering accuracy for heart failure patients using EHR data. The contribution is incremental: it combines existing methods in a novel ensemble framework.

The paper tackles the problem of clustering patients and distinguishing disease subtypes in electronic health records (EHRs). It investigates traditional, hybrid, and deep learning methods, and introduces an ensemble-based deep clustering approach that achieves the best overall performance ranking across 14 methods and multiple patient cohorts.

In electronic health records (EHRs), clustering patients and distinguishing disease subtypes are key tasks for elucidating pathophysiology and aiding clinical decision-making. However, clustering in healthcare informatics still relies mainly on traditional methods, especially K-means, and hybrid methods that apply such clustering to embedding representations learned by autoencoders have achieved limited success. This paper investigates the effectiveness of traditional, hybrid, and deep learning methods in heart failure patient cohorts using real EHR data from the All of Us Research Program. Traditional clustering methods perform robustly relative to deep learning approaches, which are designed specifically for image clustering, a task that differs substantially from the tabular EHR data setting. To address the shortcomings of deep clustering, we introduce an ensemble-based deep clustering approach that aggregates cluster assignments obtained from multiple embedding dimensions, rather than relying on a single fixed embedding space. When combined with traditional clustering in a novel ensemble framework, the proposed ensemble embedding for deep clustering delivers the best overall performance ranking across 14 diverse clustering methods and multiple patient cohorts. This paper underscores the importance of biological sex-specific clustering of EHR data and the advantages of combining traditional and deep clustering approaches over relying on a single method.
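
The idea of aggregating cluster assignments across several embedding dimensions can be illustrated with a minimal sketch. This is not the paper's implementation. It rests on several assumptions: a PCA projection stands in for a learned autoencoder embedding, a plain k-means with deterministic farthest-first initialisation stands in for the base clusterer, and a co-association (consensus) matrix is used as the aggregation rule. The function names `embed`, `kmeans`, `farthest_first_centers`, and `ensemble_cluster`, and the default `dims=(2, 4, 8)`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def farthest_first_centers(X, k):
    """Deterministic init: start at row 0, then repeatedly add the farthest row."""
    chosen = [0]
    d = ((X - X[0]) ** 2).sum(axis=1)
    for _ in range(k - 1):
        nxt = int(d.argmax())
        chosen.append(nxt)
        d = np.minimum(d, ((X - X[nxt]) ** 2).sum(axis=1))
    return X[chosen].astype(float)


def kmeans(X, k, n_iter=20):
    """Plain Lloyd's k-means; returns integer labels of shape (n_samples,)."""
    centers = farthest_first_centers(X, k)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Squared distance from every row to every center, shape (n, k).
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave a center in place if its cluster is empty
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def embed(X, dim):
    """Assumed stand-in for an autoencoder: project onto the top `dim` principal components."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:dim].T


def ensemble_cluster(X, k, dims=(2, 4, 8)):
    """Cluster in several embedding sizes and aggregate via a co-association matrix."""
    n = len(X)
    coassoc = np.zeros((n, n))
    for d in dims:
        labels = kmeans(embed(X, d), k)
        # Entry (i, j) counts how often rows i and j share a cluster.
        coassoc += labels[:, None] == labels[None, :]
    coassoc /= len(dims)
    # Consensus step (an assumption): cluster the rows of the co-association matrix.
    return kmeans(coassoc, k), coassoc
```

Calling `ensemble_cluster(X, k=2)` on a numeric feature matrix `X` with at least 8 columns returns the consensus labels together with the co-association matrix, whose entries give the fraction of embedding sizes in which two patients were grouped together. In this sketch, the point of the ensemble is that no single embedding dimension has to be chosen in advance; disagreement between embedding sizes shows up as intermediate values in the matrix.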

