QM, LG | Jun 10, 2025

scSSL-Bench: Benchmarking Self-Supervised Learning for Single-Cell Data

arXiv:2506.10031v1 | 2 citations | h-index: 11 | ICML
Originality: Synthesis-oriented
AI Analysis

This work gives single-cell genomics researchers a standardized benchmark to guide their use of self-supervised learning methods. Its contribution is incremental: it evaluates existing methods rather than introducing new ones.

The authors address the evaluation of self-supervised learning methods for single-cell data with scSSL-Bench, a benchmark covering nineteen methods, nine datasets, and three tasks. They find that specialized frameworks such as scVI excel at batch correction, while generic methods such as VICReg perform better at cell typing and multi-modal integration.

Self-supervised learning (SSL) has proven to be a powerful approach for extracting biologically meaningful representations from single-cell data. To advance our understanding of SSL methods applied to single-cell data, we present scSSL-Bench, a comprehensive benchmark that evaluates nineteen SSL methods. Our evaluation spans nine datasets and focuses on three common downstream tasks: batch correction, cell type annotation, and missing modality prediction. Furthermore, we systematically assess various data augmentation strategies. Our analysis reveals task-specific trade-offs: the specialized single-cell frameworks, scVI, CLAIRE, and the finetuned scGPT excel at uni-modal batch correction, while generic SSL methods, such as VICReg and SimCLR, demonstrate superior performance in cell typing and multi-modal data integration. Random masking emerges as the most effective augmentation technique across all tasks, surpassing domain-specific augmentations. Notably, our results indicate the need for a specialized single-cell multi-modal data integration framework. scSSL-Bench provides a standardized evaluation platform and concrete recommendations for applying SSL to single-cell analysis, advancing the convergence of deep learning and single-cell genomics.
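The abstract singles out random masking as the most effective augmentation. As a rough illustration of the general idea only, the sketch below zeroes out a random fraction of entries in a cells-by-genes expression matrix and produces two independently masked views, as a contrastive method such as SimCLR or VICReg would consume. The function names, the zero fill value, and the `mask_rate` default are assumptions made for illustration; they are not taken from scSSL-Bench or its code.

```python
import numpy as np


def random_mask(expr, mask_rate=0.2, rng=None):
    """Return a masked copy of `expr` and the boolean mask that was applied.

    Hypothetical helper: entries selected by the mask are set to zero.
    """
    rng = np.random.default_rng() if rng is None else rng
    expr = np.asarray(expr, dtype=float)
    # True marks an entry to hide; each entry is chosen independently.
    mask = rng.random(expr.shape) < mask_rate
    return np.where(mask, 0.0, expr), mask


def two_views(expr, mask_rate=0.2, seed=0):
    """Build two independently masked views of the same cells."""
    rng = np.random.default_rng(seed)
    view_a, _ = random_mask(expr, mask_rate, rng)
    view_b, _ = random_mask(expr, mask_rate, rng)
    return view_a, view_b


# Toy matrix: 3 cells x 4 genes, all values nonzero.
cells = np.arange(1, 13, dtype=float).reshape(3, 4)
view_a, view_b = two_views(cells, mask_rate=0.5, seed=0)
```

Each view keeps the shape of the input, and every entry is either the original value or zero. A real pipeline would typically apply this per mini-batch after normalization, and the best mask rate is an empirical question.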

Code Implementations: 1 repo
