SparGE: Sparse Coding-based Patient Similarity Learning via Low-rank Constraints and Graph Embedding
This work addresses patient similarity assessment for personalized medicine, but it appears incremental as it builds on existing sparse coding and graph embedding techniques.
The authors tackled the problem of patient similarity assessment from electronic health records by proposing SparGE, a framework that uses sparse coding and graph embedding to handle data deficiencies like missing values and noise, achieving significant performance improvements over other methods on real-world datasets.
Patient similarity assessment (PSA) is pivotal to evidence-based and personalized medicine, enabled by analyzing the increasingly available electronic health records (EHRs). However, machine learning approaches for PSA have to deal with inherent data deficiencies of EHRs, namely missing values, noise, and small sample sizes. In this work, an end-to-end discriminative learning framework, called SparGE, is proposed to address these data challenges of EHRs for PSA. SparGE measures similarity by joint sparse coding and graph embedding. First, we use low-rank constrained sparse coding to identify similar patients and calculate their weights, while denoising against missing values. Then, graph embedding on the sparse representations is adopted to measure the similarity between patient pairs by preserving local relationships defined by distances. Finally, a global cost function is constructed to optimize the related parameters. Experimental results on two real-world healthcare datasets, the private SingHEART and the public MIMIC-III, show that the proposed SparGE significantly outperforms other machine learning patient similarity methods.
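To make the pipeline described in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It makes several assumptions that are not in the paper: the low-rank constraint is replaced by a plain truncated SVD, the sparse codes are found with ISTA on an l1-penalized self-representation problem, the graph embedding is a standard Laplacian eigenmap, and the three stages run sequentially rather than under one global cost function. All function names, parameters (`rank`, `lam`, `dim`), and the synthetic data are hypothetical.

```python
import numpy as np

def low_rank_approx(X, rank):
    # Stand-in for the low-rank constraint (assumption): truncated SVD.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def soft_threshold(x, t):
    # Proximal operator of the l1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_self_representation(X, lam=0.1, n_iter=200):
    # Code each patient (row of X) as a sparse combination of the others:
    #   min_C 0.5 * ||X - C X||_F^2 + lam * ||C||_1,  diag(C) = 0,
    # solved with ISTA (proximal gradient descent).
    n = X.shape[0]
    G = X @ X.T                                   # Gram matrix, n x n
    step = 1.0 / (np.linalg.norm(G, 2) + 1e-12)   # 1 / Lipschitz constant
    C = np.zeros((n, n))
    for _ in range(n_iter):
        grad = C @ G - G                          # gradient of the smooth term
        C = soft_threshold(C - step * grad, step * lam)
        np.fill_diagonal(C, 0.0)                  # a patient may not code itself
    return C

def laplacian_embedding(C, dim=2):
    # Build a symmetric affinity graph from the sparse codes and embed it
    # with the eigenvectors of the normalized graph Laplacian.
    W = 0.5 * (np.abs(C) + np.abs(C).T)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1) + 1e-12)
    L = np.eye(len(W)) - (d_inv_sqrt[:, None] * W) * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)                   # eigenvalues in ascending order
    return vecs[:, 1:dim + 1]                     # skip the first eigenvector

def patient_similarity(X, rank=3, lam=0.1, dim=2):
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)   # standardize features
    X = low_rank_approx(X, rank)
    C = sparse_self_representation(X, lam)
    Y = laplacian_embedding(C, dim)
    dist = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
    return C, Y, np.exp(-dist)                    # similarity in (0, 1]

# Synthetic stand-in for EHR features: two groups of 10 "patients" each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (10, 6)), rng.normal(4, 1, (10, 6))])
C, Y, S = patient_similarity(X)
```

Here `C[i, j]` plays the role of the weight of patient `j` in the sparse code of patient `i`, and `S[i, j]` is a similarity score derived from distances in the embedded space. A faithful reproduction would instead optimize the coding and embedding parameters jointly, as the abstract states.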