LG | Mar 26, 2025

Multi-dataset and Transfer Learning Using Gene Expression Knowledge Graphs

arXiv:2503.20400v1 | h-index: 2 | Has Code | ESWC
Originality: Incremental advance
AI Analysis

This work addresses data integration and scalability issues in biomedical machine learning for disease diagnosis, though it appears incremental as it builds on existing knowledge graph and embedding techniques.

The paper tackles the challenge of limited and incompatible gene expression datasets for patient diagnosis by integrating multiple datasets and domain-specific knowledge using knowledge graphs, and reports improved diagnosis across single-dataset, multi-dataset, and transfer learning settings.

Gene expression datasets offer insights into gene regulation mechanisms, biochemical pathways, and cellular functions. Additionally, comparing gene expression profiles between disease and control patients can deepen the understanding of disease pathology. Therefore, machine learning has been used to process gene expression data, with patient diagnosis emerging as one of the most popular applications. Although gene expression data can provide valuable insights, two challenges arise: the number of patients in an expression dataset is usually limited, and datasets that cover different genes cannot be easily combined. This work proposes a novel methodology to address these challenges by integrating multiple gene expression datasets and domain-specific knowledge using knowledge graphs, a unique tool for biomedical data integration. Knowledge graph embedding techniques then produce vector representations, which serve as inputs to a graph neural network and a multi-layer perceptron. We evaluate the efficacy of our methodology in three settings: single-dataset learning, multi-dataset learning, and transfer learning. The experimental results show that combining gene expression datasets and domain-specific knowledge improves patient diagnosis in all three settings.
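
The integration idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the relation names (`overexpresses`, `underexpresses`, `annotated_with`), the expression threshold, and every patient, gene, and term identifier below are hypothetical, chosen only to show how two datasets with no genes in common can end up in one connected graph once domain knowledge links their genes.

```python
from collections import defaultdict, deque


def expression_triples(dataset, threshold=1.0):
    """Convert {patient: {gene: value}} into (patient, relation, gene) triples.

    Assumption: values are normalized scores, and only genes whose absolute
    score reaches `threshold` are linked to the patient.
    """
    triples = []
    for patient, profile in dataset.items():
        for gene, value in profile.items():
            if value >= threshold:
                triples.append((patient, "overexpresses", gene))
            elif value <= -threshold:
                triples.append((patient, "underexpresses", gene))
    return triples


def build_graph(triples):
    """Build an undirected adjacency map from (head, relation, tail) triples."""
    adjacency = defaultdict(set)
    for head, _relation, tail in triples:
        adjacency[head].add(tail)
        adjacency[tail].add(head)
    return adjacency


def connected(adjacency, start, goal):
    """Breadth-first search: is there any path between two nodes?"""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


# Two hypothetical datasets that measure disjoint sets of genes.
dataset_a = {"A:patient1": {"GENE1": 2.3, "GENE2": -0.2}}
dataset_b = {"B:patient1": {"GENE3": 1.8, "GENE4": -1.5}}

# Hypothetical domain knowledge: two genes share a functional annotation.
domain_triples = [
    ("GENE1", "annotated_with", "TERM_X"),
    ("GENE3", "annotated_with", "TERM_X"),
]

data_triples = expression_triples(dataset_a) + expression_triples(dataset_b)

graph_without_domain = build_graph(data_triples)
graph_with_domain = build_graph(data_triples + domain_triples)

linked_without = connected(graph_without_domain, "A:patient1", "B:patient1")
linked_with = connected(graph_with_domain, "A:patient1", "B:patient1")
```

In this toy setup the two patients are linked only when the domain triples are included. A knowledge graph embedding method would then be trained on the combined triples, and the resulting patient vectors passed to a downstream classifier; those steps are omitted here because the summary does not specify which embedding method or architecture details the paper uses.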

Code Implementations: 1 repo