QM AI LG | Jun 15, 2024

Horizon-wise Learning Paradigm Promotes Gene Splicing Identification

arXiv:2406.11900v1 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses gene splicing identification, a key task in AI-bioinformatics collaboration, with incremental improvements in accuracy and efficiency.

The paper tackles gene splicing identification by proposing a horizon-wise learning paradigm that processes entire sequences in one forward pass, achieving 97.20% accuracy on the Human dataset and outperforming SpliceAI.

Identifying gene splicing is a core task at the intersection of artificial intelligence and bioinformatics. Past decades have seen substantial work on this problem, such as the bio-plausible splicing pattern AT-CG and the well-known SpliceAI. In this paper, we propose a novel framework for gene splicing identification, named Horizon-wise Gene Splicing Identification (H-GSI). H-GSI follows the horizon-wise identification paradigm and comprises four components: a pre-processing procedure that transforms string data into tensors, a sliding window technique for handling long sequences, the SeqLab model, and the predictor. In contrast to existing studies that process gene information as a truncated fixed-length sequence, H-GSI predicts all positions in a sequence with a single forward computation, which improves both accuracy and efficiency. Experiments on the real-world Human dataset show that H-GSI outperforms SpliceAI and achieves the best accuracy of 97.20%. The source code is available from this link.
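
The pipeline described in the abstract (string-to-tensor pre-processing, sliding windows over long sequences, and per-position labeling of each window in one call) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the window and stride sizes, the averaging of overlapping windows, and the `toy_labeler` stand-in for the SeqLab model are all assumptions made purely for illustration.

```python
# Illustrative sketch only. All names, sizes, and the stand-in scorer are
# assumptions for this example; none of this is taken from the H-GSI code.
BASES = "ACGT"


def one_hot(seq):
    """Map a DNA string to one-hot rows; unknown bases (e.g. N) become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def sliding_windows(length, window, stride):
    """Return (start, end) pairs covering positions 0..length-1.

    Assumes 0 < stride <= window so that no position is skipped.
    """
    if length <= window:
        return [(0, length)]
    starts = list(range(0, length - window + 1, stride))
    if starts[-1] + window < length:
        starts.append(length - window)  # final window flush with the sequence end
    return [(s, s + window) for s in starts]


def toy_labeler(rows):
    """Hypothetical stand-in for a sequence labeling model: one score per position.

    It simply returns 1.0 wherever the base is G and 0.0 elsewhere.
    """
    g_index = BASES.index("G")
    return [row[g_index] for row in rows]


def predict_all_positions(seq, window=8, stride=4, model=toy_labeler):
    """Score every position; scores from overlapping windows are averaged."""
    encoded = one_hot(seq)
    sums = [0.0] * len(seq)
    counts = [0] * len(seq)
    for start, end in sliding_windows(len(seq), window, stride):
        scores = model(encoded[start:end])  # one call labels the whole window
        for offset, score in enumerate(scores):
            sums[start + offset] += score
            counts[start + offset] += 1
    return [s / c for s, c in zip(sums, counts)]


scores = predict_all_positions("ACGTGGTAAGT")
```

The point of the sketch is the contrast drawn in the abstract: the model is called once per window and returns a label for every position in it, rather than being called once per position on a truncated context around that position.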
