CLAIJun 12, 2025

BioClinical ModernBERT: A State-of-the-Art Long-Context Encoder for Biomedical and Clinical NLP

arXiv:2506.10896v131 citationsh-index: 35Has Code
Originality Incremental advance
AI Analysis

This work addresses the problem of slow development and limited adaptation of encoders for biomedical and clinical NLP practitioners, representing an incremental improvement with domain-specific enhancements.

The paper tackled the limited domain adaptation of encoder models in biomedical and clinical NLP by introducing BioClinical ModernBERT, a domain-adapted encoder that outperforms existing models on four downstream tasks after pretraining on over 53.5 billion tokens from diverse datasets.

Encoder-based transformer models are central to biomedical and clinical Natural Language Processing (NLP), as their bidirectional self-attention makes them well-suited for efficiently extracting structured information from unstructured text through discriminative tasks. However, encoders have seen slower development compared to decoder models, leading to limited domain adaptation in biomedical and clinical settings. We introduce BioClinical ModernBERT, a domain-adapted encoder that builds on the recent ModernBERT release, incorporating long-context processing and substantial improvements in speed and performance for biomedical and clinical NLP. BioClinical ModernBERT is developed through continued pretraining on the largest biomedical and clinical corpus to date, with over 53.5 billion tokens, and addresses a key limitation of prior clinical encoders by leveraging 20 datasets from diverse institutions, domains, and geographic regions, rather than relying on data from a single source. It outperforms existing biomedical and clinical encoders on four downstream tasks spanning a broad range of use cases. We release both base (150M parameters) and large (396M parameters) versions of BioClinical ModernBERT, along with training checkpoints to support further research.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes