GN | CL | Jul 15, 2024

Bridging Sequence-Structure Alignment in RNA Foundation Models

arXiv:2407.11242v3 | 6 citations | h-index: 9
Originality: Incremental advance
AI Analysis

OmniGenome addresses a bottleneck in RNA genomics by enabling bidirectional mappings between sequences and structures, though the advance is incremental because it builds on existing foundation model paradigms.

The study tackles sequence-structure alignment in RNA foundation models by introducing OmniGenome, which solved 74% of the puzzles on the EternaV2 benchmark, whereas existing models solved at most 3%. It also attained state-of-the-art performance on RNA and DNA benchmarks without any training on DNA genomes.

The alignment between RNA sequences and structures in foundation models (FMs) has yet to be thoroughly investigated. Existing FMs have struggled to establish sequence-structure alignment, hindering the free flow of genomic information between RNA sequences and structures. In this study, we introduce OmniGenome, an RNA FM trained to align RNA sequences with respect to secondary structures based on structure-contextualised modelling. The alignment enables free and bidirectional mappings between sequences and structures by utilising the flexible RNA modelling paradigm that supports versatile input and output modalities, i.e., sequence and/or structure as input/output. We implement RNA design and zero-shot secondary structure prediction as case studies to evaluate the Seq2Str and Str2Seq mapping capacity of OmniGenome. Results on the EternaV2 benchmark show that OmniGenome solved 74% of puzzles, whereas existing FMs only solved up to 3% of the puzzles due to the oversight of sequence-structure alignment. We leverage four comprehensive in-silico genome modelling benchmarks to evaluate performance across a diverse set of genome downstream tasks, where the results show that OmniGenome achieves state-of-the-art performance on RNA and DNA benchmarks, even without any training on DNA genomes.

Code Implementations: 1 repo