Decoding Translation-Related Functional Sequences in 5'UTRs Using Interpretable Deep Learning Models
This work addresses the need for flexible, interpretable deep learning models of mRNA translation regulation, with potential applications in therapeutic mRNA design. The contribution is incremental, since it builds on existing Transformer architectures.
The paper tackles the problem of predicting translational efficiency from variable-length 5'UTR sequences by introducing UTR-STCNet, a Transformer-based model with interpretable modules, which outperforms state-of-the-art baselines on three benchmark datasets for mean ribosome load (MRL) prediction.
Understanding how 5' untranslated regions (5'UTRs) regulate mRNA translation is critical for controlling protein expression and designing effective therapeutic mRNAs. While recent deep learning models have shown promise in predicting translational efficiency from 5'UTR sequences, most are constrained by fixed input lengths and limited interpretability. We introduce UTR-STCNet, a Transformer-based architecture for flexible and biologically grounded modeling of variable-length 5'UTRs. UTR-STCNet integrates a Saliency-Aware Token Clustering (SATC) module that iteratively aggregates nucleotide tokens into multi-scale, semantically meaningful units based on saliency scores. A Saliency-Guided Transformer (SGT) block then captures both local and distal regulatory dependencies using a lightweight attention mechanism. This combined architecture achieves efficient and interpretable modeling without input truncation or increased computational cost. Evaluated across three benchmark datasets, UTR-STCNet consistently outperforms state-of-the-art baselines in predicting mean ribosome load (MRL), a key proxy for translational efficiency. Moreover, the model recovers known functional elements such as upstream AUGs and Kozak motifs, highlighting its potential for mechanistic insight into translation regulation.
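To make the functional elements named above concrete, the following is a minimal sketch of how upstream AUGs and their Kozak context could be located in a 5'UTR by plain sequence scanning. It is not part of UTR-STCNet and does not reproduce the paper's saliency analysis. The function name `find_upstream_augs`, the assumption that the input ends immediately before the main start codon, and the three-level strength heuristic (purine at -3, G at +4) are illustrative choices, not taken from the paper.

```python
def find_upstream_augs(utr):
    """Scan a 5'UTR for upstream AUGs (hypothetical helper, not from the paper).

    Assumes `utr` ends immediately before the main start codon, so an
    upstream AUG at index i is in frame with the main ORF when
    (len(utr) - i) is a multiple of 3.

    Returns a list of (index, in_frame, kozak_strength) tuples.
    """
    # Accept RNA or DNA alphabets by normalising U to T.
    seq = utr.upper().replace("U", "T")
    hits = []
    for i in range(len(seq) - 2):
        if seq[i:i + 3] != "ATG":
            continue
        in_frame = (len(seq) - i) % 3 == 0
        # Simplified Kozak heuristic: with the A of AUG as +1,
        # look for a purine at -3 and a G at +4.
        minus3 = seq[i - 3] if i >= 3 else None
        plus4 = seq[i + 3] if i + 3 < len(seq) else None
        score = int(minus3 in ("A", "G")) + int(plus4 == "G")
        strength = ("weak", "adequate", "strong")[score]
        hits.append((i, in_frame, strength))
    return hits


if __name__ == "__main__":
    example = "GCCACCAUGGCGUUAUGCAAC"  # made-up sequence for illustration
    for index, in_frame, strength in find_upstream_augs(example):
        print(index, in_frame, strength)
```

A scan like this gives a rule-based baseline of annotated positions against which a model's saliency scores could be compared, under the assumptions stated above.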