GN | AI | LG | May 19, 2025

scSiameseClu: A Siamese Clustering Framework for Interpreting single-cell RNA Sequencing Data

arXiv:2505.12626v3 | 8 citations | h-index: 18 | IJCAI
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific problem for researchers in bioinformatics and genomics by improving cell type identification, though it appears incremental as it builds on existing graph neural network methods.

The paper tackled the challenges of noise, sparsity, and over-smoothing in single-cell RNA sequencing data clustering by proposing scSiameseClu, a Siamese clustering framework that outperformed state-of-the-art methods on seven real-world datasets.

Single-cell RNA sequencing (scRNA-seq) reveals cell heterogeneity, with cell clustering playing a key role in identifying cell types and marker genes. Recent advances, especially graph neural network (GNN)-based methods, have significantly improved clustering performance. However, the analysis of scRNA-seq data remains challenging due to noise, sparsity, and high dimensionality. Compounding these challenges, GNNs often suffer from over-smoothing, limiting their ability to capture complex biological information. In response, we propose scSiameseClu, a novel Siamese Clustering framework for interpreting single-cell RNA-seq data, comprising three key steps: (1) Dual Augmentation Module, which applies biologically informed perturbations to the gene expression matrix and cell graph relationships to enhance representation robustness; (2) Siamese Fusion Module, which combines cross-correlation refinement and adaptive information fusion to capture complex cellular relationships while mitigating over-smoothing; and (3) Optimal Transport Clustering, which utilizes Sinkhorn distance to efficiently align cluster assignments with predefined proportions while maintaining balance. Comprehensive evaluations on seven real-world datasets demonstrate that scSiameseClu outperforms state-of-the-art methods in single-cell clustering, cell type annotation, and cell type classification, providing a powerful tool for scRNA-seq data interpretation.
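To make step (3) more concrete, the sketch below shows one generic way a Sinkhorn-style iteration can align soft cluster assignments with predefined cluster proportions. It is not the paper's implementation: the function name `sinkhorn_assign`, the parameters `epsilon` and `n_iters`, and the random toy scores are all assumptions made for illustration, and the abstract does not specify the exact formulation the authors use.

```python
import numpy as np


def sinkhorn_assign(scores, proportions, epsilon=0.5, n_iters=200):
    """Balanced soft assignment via Sinkhorn-Knopp scaling (illustrative sketch).

    scores:      (n_cells, n_clusters) cell-to-cluster similarities, assumed to
                 lie in a bounded range such as [0, 1] so the exponential below
                 does not underflow.
    proportions: (n_clusters,) target cluster proportions summing to 1.
    epsilon:     entropic regularisation strength (hypothetical default).
    Returns an (n_cells, n_clusters) matrix whose rows sum to roughly 1 and
    whose column sums match proportions * n_cells.
    """
    n_cells, n_clusters = scores.shape
    row_marginal = np.full(n_cells, 1.0 / n_cells)    # every cell carries equal mass
    col_marginal = np.asarray(proportions, dtype=float)

    # Gibbs kernel; subtracting the maximum keeps the exponent non-positive.
    kernel = np.exp((scores - scores.max()) / epsilon)

    u = np.ones(n_cells)
    v = np.ones(n_clusters)
    for _ in range(n_iters):
        u = row_marginal / (kernel @ v)      # rescale rows toward the row marginal
        v = col_marginal / (kernel.T @ u)    # rescale columns toward the proportions

    plan = u[:, None] * kernel * v[None, :]  # transport plan with total mass 1
    return plan * n_cells                    # rescale so each row sums to about 1


# Toy usage: 12 cells, 3 clusters with target proportions 50% / 25% / 25%.
rng = np.random.default_rng(0)
scores = rng.random((12, 3))
proportions = np.array([0.5, 0.25, 0.25])
soft = sinkhorn_assign(scores, proportions)
labels = soft.argmax(axis=1)  # hard labels, if needed downstream
```

The column sums of `soft` equal the target proportions times the number of cells, which is the balance constraint the abstract refers to. A smaller `epsilon` gives sharper assignments but needs more iterations and, in practice, a log-domain implementation for numerical stability.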
