CLGN · Mar 6, 2025

Large Language Models in Bioinformatics: A Survey

arXiv:2503.04490v2 · 19 citations · h-index: 5 · ACL
AI Analysis

The paper provides a comprehensive overview for researchers in bioinformatics and AI, but as a survey its contribution is incremental.

This survey reviews how Large Language Models (LLMs) are applied in bioinformatics to analyze DNA, RNA, proteins, and single-cell data, highlighting their potential to drive innovations in precision medicine.

Large Language Models (LLMs) are revolutionizing bioinformatics, enabling advanced analysis of DNA, RNA, proteins, and single-cell data. This survey provides a systematic review of recent advancements, focusing on genomic sequence modeling, RNA structure prediction, protein function inference, and single-cell transcriptomics. We also discuss several key challenges, including data scarcity, computational complexity, and cross-omics integration, and explore future directions such as multimodal learning, hybrid AI models, and clinical applications. By offering a comprehensive perspective, this paper underscores the transformative potential of LLMs in driving innovations in bioinformatics and precision medicine.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it rather than by global fame.
