PIVETed-Granite: Computational Phenotypes through Constrained Tensor Factorization
This work addresses the challenge of generating interpretable computational phenotypes for medical applications. Its contribution is incremental: it builds on existing tensor factorization methods by incorporating domain knowledge.
The authors tackle the problem of extracting clinically meaningful phenotypes from multi-modal electronic health record data. They introduce PIVETed-Granite, a method that uses cannot-link constraints mined from biomedical literature to guide tensor factorization. In experiments on a large VUMC dataset, the method yields clearer and more distinct phenotypes for hypertensive patients.
It has recently been shown that sparse, nonnegative tensor factorization of multi-modal electronic health record data is a promising approach to high-throughput computational phenotyping. However, such approaches typically do not leverage available domain knowledge while extracting the phenotypes; hence, some of the suggested phenotypes may not map well to clinical concepts or may be very similar to other suggested phenotypes. To address these issues, we present a novel, automatic approach called PIVETed-Granite that mines existing biomedical literature (PubMed) to obtain cannot-link constraints, which are then used as side information during a tensor-factorization-based computational phenotyping process. The resulting improvements are clearly observed in experiments using a large dataset from VUMC to identify phenotypes for hypertensive patients.
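To make the idea of cannot-link side information concrete, the following is a minimal sketch of how such constraints could penalize a nonnegative factor matrix. It is an illustration under stated assumptions, not the PIVETed-Granite objective itself: the quadratic penalty form, the function names (cannot_link_penalty, cannot_link_gradient, projected_step), the step size, and the toy data are all hypothetical. Here A is an items-by-components factor matrix, and M is a symmetric 0/1 matrix in which M[i, j] = 1 means items i and j should not appear together in one phenotype.

```python
import numpy as np


def cannot_link_penalty(A, M):
    """Sum over components r of a_r^T M a_r, where a_r is column r of A.

    The value is large when two cannot-link items both have weight
    in the same component (an assumed penalty form, for illustration).
    """
    return float(np.sum(A * (M @ A)))


def cannot_link_gradient(A, M):
    """Gradient of the penalty with respect to A."""
    return (M + M.T) @ A


def projected_step(A, M, lr=0.1):
    """One gradient step on the penalty alone, projected onto A >= 0.

    In a full factorization this gradient would be added to the
    gradient of the data-fit term rather than used by itself.
    """
    return np.maximum(A - lr * cannot_link_gradient(A, M), 0.0)


# Toy example (hypothetical data): 3 items, 2 components.
# Items 0 and 1 are marked cannot-link.
M = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])

penalty_before = cannot_link_penalty(A, M)
A_new = projected_step(A, M)
penalty_after = cannot_link_penalty(A_new, M)
print(penalty_before, penalty_after)
```

In this toy case only the first component contains both constrained items, so only that component contributes to the penalty, and a single projected step lowers it while keeping every entry nonnegative. Weighting such a term against the data-fit loss is a design choice that this sketch leaves open.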