Mar 16, 2025

GCBLANE: A graph-enhanced convolutional BiLSTM attention network for improved transcription factor binding site prediction

arXiv:2503.12377v1
h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses TFBS prediction for genomics researchers. It is an incremental improvement that integrates graph-based learning with existing sequence analysis methods.

The paper introduces GCBLANE, a graph-enhanced convolutional BiLSTM attention network for identifying transcription factor binding sites (TFBS) in genomic data. It achieved an average AUC of 0.943 on 690 ENCODE datasets and 0.9495 on 165 ENCODE datasets, outperforming advanced models, including multimodal ones that use DNA shape information.

Identifying transcription factor binding sites (TFBS) is crucial for understanding gene regulation, as these sites enable transcription factors (TFs) to bind to DNA and modulate gene expression. Despite advances in high-throughput sequencing, accurately identifying TFBS remains challenging because of the sheer volume of genomic data and the complexity of binding patterns. GCBLANE, a graph-enhanced convolutional bidirectional Long Short-Term Memory (LSTM) attention network, is introduced to address this issue. It integrates convolutional, multi-head attention, and recurrent layers with a graph neural network to detect key features for TFBS prediction. On 690 ENCODE ChIP-Seq datasets, GCBLANE achieved an average AUC of 0.943, and on 165 ENCODE datasets it reached an AUC of 0.9495, outperforming advanced models that use multimodal approaches, including DNA shape information. By combining graph-based learning with sequence analysis, GCBLANE significantly advances TFBS prediction.
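To make the reported metric concrete, the following is a minimal sketch of how an average AUC across several datasets could be computed. It is not the authors' evaluation code. The function names `roc_auc` and `mean_auc` and the toy labels and scores are hypothetical, chosen only for illustration. AUC is computed here as the probability that a randomly chosen bound sequence (label 1) receives a higher score than a randomly chosen unbound sequence (label 0), with ties counted as half.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the ROC AUC via pairwise comparison of positives and negatives.

    Quadratic in the number of examples, which is fine for a small
    illustration but too slow for full ChIP-Seq datasets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


def mean_auc(datasets: List[Tuple[Sequence[int], Sequence[float]]]) -> float:
    """Unweighted mean of per-dataset AUCs (each dataset counts equally)."""
    return sum(roc_auc(y, s) for y, s in datasets) / len(datasets)


# Hypothetical toy data: two tiny "datasets" of (labels, predicted scores).
toy = [
    ([1, 1, 0, 0], [0.9, 0.4, 0.35, 0.1]),   # perfectly ranked
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),   # one positive ranked below a negative
]
average = mean_auc(toy)
```

In this sketch each dataset contributes equally to the average regardless of its size. Whether the paper's averages are weighted this way is an assumption.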

Code Implementations: 1 repo
