Deep Learning Approaches for Blood Disease Diagnosis Across Hematopoietic Lineages
This work targets blood disease diagnosis, a medically relevant application, but its contribution is incremental: it applies existing deep learning methods to a specific domain.
The authors address blood disease diagnosis across hematopoietic lineages with a deep learning framework in which an autoencoder reduces over 20,000 gene features to a 256-dimensional latent space. Classifiers trained on these embeddings reach over 95% accuracy in multi-class classification and an F1-score above 0.7 in zero-shot binary classification.
We present a foundation modeling framework that leverages deep learning to uncover latent genetic signatures across the hematopoietic hierarchy. Our approach trains a fully connected autoencoder on multipotent progenitor cells, reducing over 20,000 gene features to a 256-dimensional latent space that captures predictive information for both progenitor cells and downstream differentiated cells such as monocytes and lymphocytes. We validate the quality of these embeddings by training feed-forward, transformer, and graph convolutional architectures on blood disease diagnosis tasks. We also explore zero-shot prediction, using a disease state classification model trained on progenitor cells to classify the condition of downstream cells. Our models achieve greater than 95% accuracy for multi-class classification, and in the zero-shot setting they achieve an F1-score greater than 0.7 on the binary classification task. Future work should further improve the embeddings to increase robustness, on lymphocyte classification in particular.
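To make the embedding step concrete, the following is a minimal sketch of a fully connected autoencoder that compresses a gene-expression matrix into a low-dimensional latent space. It is an illustration under stated assumptions, not the authors' implementation: the single hidden layer, ReLU activation, mean-squared-error loss, plain gradient descent, and all function names (`init_autoencoder`, `encode`, `decode`, `train_step`) are hypothetical choices, since the abstract does not specify them. The data are synthetic, and the dimensions are scaled down (200 features to 16 latent units) so the example runs quickly. The setting described above would use over 20,000 features and 256 latent units.

```python
import numpy as np

def init_autoencoder(n_features, n_latent, rng):
    """Create weights for a one-hidden-layer fully connected autoencoder."""
    return {
        # He-style scaling for the ReLU encoder (an assumed choice)
        "W_enc": rng.normal(0.0, np.sqrt(2.0 / n_features), (n_features, n_latent)),
        "b_enc": np.zeros(n_latent),
        "W_dec": rng.normal(0.0, np.sqrt(1.0 / n_latent), (n_latent, n_features)),
        "b_dec": np.zeros(n_features),
    }

def encode(params, x):
    """Map cells (rows of x) to non-negative latent embeddings."""
    return np.maximum(0.0, x @ params["W_enc"] + params["b_enc"])

def decode(params, z):
    """Reconstruct gene features from latent embeddings."""
    return z @ params["W_dec"] + params["b_dec"]

def train_step(params, x, lr):
    """One full-batch gradient-descent step; returns the loss before the update."""
    z = encode(params, x)
    err = decode(params, z) - x
    loss = np.mean(err ** 2)

    # Backpropagate the mean squared reconstruction error by hand.
    grad_out = 2.0 * err / err.size
    grad_z = grad_out @ params["W_dec"].T
    grad_pre = grad_z * (z > 0)  # ReLU derivative
    grads = {
        "W_dec": z.T @ grad_out,
        "b_dec": grad_out.sum(axis=0),
        "W_enc": x.T @ grad_pre,
        "b_enc": grad_pre.sum(axis=0),
    }
    for name, grad in grads.items():
        params[name] -= lr * grad
    return loss

# Synthetic stand-in for an expression matrix: 64 cells x 200 genes,
# generated from 8 hidden factors so that a compact embedding exists.
rng = np.random.default_rng(0)
n_cells, n_genes, n_latent = 64, 200, 16
x = rng.normal(size=(n_cells, 8)) @ rng.normal(size=(8, n_genes))
x = (x - x.mean(axis=0)) / x.std(axis=0)  # standardize each gene

params = init_autoencoder(n_genes, n_latent, rng)
losses = [train_step(params, x, lr=0.01) for _ in range(200)]

# The encoder output is what a downstream classifier would consume.
embeddings = encode(params, x)
print(embeddings.shape, losses[0] > losses[-1])
```

In this sketch only the encoder is kept after training. A downstream diagnosis model, whether feed-forward, transformer, or graph convolutional, would take `embeddings` as input in place of the raw gene features.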