BM AI GN
Oct 3, 2025

SAE-RNA: A Sparse Autoencoder Model for Interpreting RNA Language Model Representations

arXiv:2510.02734v1
h-index: 1
Originality: Synthesis-oriented
AI Analysis

This work addresses interpretability for researchers using RNA language models, though it is incremental as it applies existing interpretability methods to a new domain.

The authors tackled the problem of interpreting what RNA language models encode about RNA families by developing SAE-RNA, a sparse autoencoder model that maps RiNALMo representations to known biological features, enabling concept discovery without retraining.

Deep learning, particularly with the advancement of Large Language Models, has transformed biomolecular modeling, with protein advances (e.g., ESM) inspiring emerging RNA language models such as RiNALMo. Yet what these RNA language models internally encode about messenger RNA (mRNA) or non-coding RNA (ncRNA) families, and how they encode it, remains unclear. We present SAE-RNA, an interpretability model that analyzes RiNALMo representations and maps them to known human-level biological features. Our work frames RNA interpretability as concept discovery in pretrained embeddings, without end-to-end retraining, and provides practical tools to probe what RNA LMs may encode about ncRNA families. The model can be extended to close comparisons between RNA groups, supporting hypothesis generation about previously unrecognized relationships.
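
To make the idea of concept discovery in frozen embeddings more concrete, here is a minimal sketch of a generic sparse autoencoder applied to precomputed embeddings. It is not the authors' implementation: the abstract does not specify the architecture, so the ReLU encoder, the L1 sparsity penalty, the dimensions, and the names `init_sae`, `sae_forward`, and `sae_loss` are all assumptions chosen for illustration. Random vectors stand in for RiNALMo representations.

```python
import numpy as np

def init_sae(d_model, d_hidden, seed=0):
    """Randomly initialise an overcomplete autoencoder (d_hidden > d_model)."""
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(d_model)
    return {
        "W_enc": rng.normal(0.0, scale, size=(d_model, d_hidden)),
        "b_enc": np.zeros(d_hidden),
        "W_dec": rng.normal(0.0, scale, size=(d_hidden, d_model)),
        "b_dec": np.zeros(d_model),
    }

def sae_forward(params, x):
    """Encode embeddings into non-negative codes, then reconstruct them."""
    pre = (x - params["b_dec"]) @ params["W_enc"] + params["b_enc"]
    z = np.maximum(0.0, pre)  # ReLU keeps codes non-negative
    x_hat = z @ params["W_dec"] + params["b_dec"]
    return z, x_hat

def sae_loss(params, x, l1_coeff=1e-3):
    """Reconstruction error plus an L1 penalty that encourages sparse codes."""
    z, x_hat = sae_forward(params, x)
    recon = np.mean(np.sum((x - x_hat) ** 2, axis=1))
    sparsity = np.mean(np.sum(np.abs(z), axis=1))
    return recon + l1_coeff * sparsity

# Placeholder data: 8 random vectors of width 16 standing in for
# frozen language-model embeddings (assumption, not real RiNALMo output).
rng = np.random.default_rng(1)
embeddings = rng.normal(size=(8, 16))

params = init_sae(d_model=16, d_hidden=64)
codes, recon = sae_forward(params, embeddings)
loss = sae_loss(params, embeddings)
```

In a setup like this, only the autoencoder parameters would be trained while the language model stays frozen, which is one way to read "without end-to-end retraining". Individual code dimensions could then be compared against known annotations such as ncRNA family labels.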

