LG, AI | Nov 29, 2023

Gene-MOE: A sparsely gated prognosis and classification framework exploiting pan-cancer genomic information

arXiv:2311.17401v3 | h-index: 11
Originality: Incremental advance
AI Analysis

This work addresses overfitting challenges in cancer genomics for researchers and clinicians, offering improved accuracy in classification and survival prediction, though it is incremental as it builds on existing deep learning and MOE techniques.

The paper tackles overfitting in genomic analysis caused by limited patient samples by introducing Gene-MOE, a sparsely gated framework that combines mixture-of-experts and attention layers with pan-cancer pre-training. It reports 95.8% accuracy on 33-class cancer classification and outperforms state-of-the-art models in survival analysis on 12 of 14 cancer types.

Benefiting from the advancements in deep learning, various genomic analytical techniques, such as survival analysis, classification of tumors and their subtypes, and exploration of specific pathways, have significantly enhanced our understanding of the biological mechanisms driving cancer. However, the overfitting issue, arising from the limited number of patient samples, poses a challenge in improving the accuracy of genome analysis by deepening the neural network. Furthermore, it remains uncertain whether novel approaches such as the sparsely gated mixture of expert (MOE) and self-attention mechanisms can improve the accuracy of genomic analysis. In this paper, we introduce a novel sparsely gated RNA-seq analysis framework called Gene-MOE. This framework exploits the potential of the MOE layers and the proposed mixture of attention expert (MOAE) layers to enhance the analysis accuracy. Additionally, it addresses overfitting challenges by integrating pan-cancer information from 33 distinct cancer types through pre-training. We pre-trained Gene-MOE on TCGA pan-cancer RNA-seq dataset with 33 cancer types. Subsequently, we conducted experiments involving cancer classification and survival analysis based on the pre-trained Gene-MOE. According to the survival analysis results on 14 cancer types, Gene-MOE outperformed state-of-the-art models on 12 cancer types. Through detailed feature analysis, we found that the Gene-MOE model could learn rich feature representations of high-dimensional genes. According to the classification results, the total accuracy of the classification model for 33 cancer classifications reached 95.8%, representing the best performance compared to state-of-the-art models. These results indicate that Gene-MOE holds strong potential for use in cancer classification and survival analysis.
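
For readers unfamiliar with the term, the following is a minimal sketch of what "sparsely gated" usually means in a mixture-of-experts layer: a gating network scores every expert, only the top-k experts are evaluated, and their outputs are combined with softmax weights. It is not the Gene-MOE implementation. The abstract does not specify the gating function, the number of experts, or the structure of the MOAE layers, so the names `top_k_gates`, `linear`, and `sparse_moe`, the choice of linear experts, and all sizes below are assumptions made for illustration only.

```python
import math
import random


def top_k_gates(logits, k):
    """Softmax over the k largest logits; every other expert gets weight 0."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - m) for i in top}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(logits))]


def linear(x, w):
    """Multiply a vector x (length d_in) by a d_in x d_out matrix w."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def sparse_moe(x, gate_w, experts, k=2):
    """Route x to its top-k experts (hypothetical linear experts) and mix their outputs."""
    gates = top_k_gates(linear(x, gate_w), k)
    out = [0.0] * len(experts[0][0])
    for e, g in enumerate(gates):
        if g == 0.0:
            continue  # experts with zero gate weight are never evaluated
        y = linear(x, experts[e])
        out = [o + g * v for o, v in zip(out, y)]
    return out, gates


# Toy usage with random weights (illustrative sizes, not taken from the paper).
random.seed(0)
d_in, d_out, n_experts = 6, 3, 4


def rand_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


x = [random.gauss(0, 1) for _ in range(d_in)]
gate_w = rand_matrix(d_in, n_experts)
experts = [rand_matrix(d_in, d_out) for _ in range(n_experts)]
y, gates = sparse_moe(x, gate_w, experts, k=2)
```

In this sketch, exactly `k` entries of `gates` are nonzero and they sum to 1, so per-sample compute grows with `k` rather than with the total number of experts. Production MOE layers typically add details such as noisy gating or load-balancing losses, which are omitted here.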

Code Implementations: 1 repo
