Benchmarking Dimensionality Reduction Techniques for Spatial Transcriptomics
This work provides a framework for selecting dimensionality reduction methods in spatial transcriptomics. The contribution is incremental: it builds on existing techniques and adds systematic benchmarking.
The authors evaluate dimensionality reduction techniques for spatial transcriptomics by benchmarking six methods on a cholangiocarcinoma dataset. They find that the VAE balances reconstruction and interpretability, and that MER-guided reassignment improves CMC scores by up to 12% on average.
We introduce a unified framework for evaluating dimensionality reduction techniques in spatial transcriptomics beyond standard PCA approaches. We benchmark six methods (PCA, NMF, autoencoder, VAE, and two hybrid embeddings) on a cholangiocarcinoma Xenium dataset, systematically varying the latent dimension ($k = 5$ to $40$) and the clustering resolution ($\rho = 0.1$ to $1.2$). Each configuration is evaluated using complementary metrics: reconstruction error, explained variance, cluster cohesion, and two novel biologically motivated measures, Cluster Marker Coherence (CMC) and Marker Exclusion Rate (MER). Our results show distinct performance profiles: PCA provides a fast baseline, NMF maximizes marker enrichment, VAE balances reconstruction and interpretability, and autoencoders occupy a middle ground. We select hyperparameters systematically using Pareto-optimal analysis and show that MER-guided reassignment improves biological fidelity across all methods, with CMC scores improving by up to 12\% on average. This framework enables principled selection of dimensionality reduction methods tailored to specific spatial transcriptomics analyses.
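To make the Pareto-optimal selection step concrete, the following is a minimal sketch of how non-dominated configurations could be picked from a sweep over methods, $k$, and $\rho$. It assumes only two objectives (reconstruction error to minimize, CMC to maximize); the `Config` type, the helper names, and all numeric values are hypothetical placeholders for illustration, not results or code from this work.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    """One (method, k, rho) configuration with two scores (hypothetical layout)."""
    method: str
    k: int              # latent dimension
    rho: float          # clustering resolution
    recon_error: float  # lower is better
    cmc: float          # higher is better


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.recon_error <= b.recon_error and a.cmc >= b.cmc
    strictly_better = a.recon_error < b.recon_error or a.cmc > b.cmc
    return no_worse and strictly_better


def pareto_front(configs: List[Config]) -> List[Config]:
    """Keep every configuration that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Placeholder scores, invented purely to exercise the function.
configs = [
    Config("PCA", 10, 0.4, 0.30, 0.55),
    Config("PCA", 20, 0.8, 0.22, 0.50),
    Config("NMF", 10, 0.4, 0.35, 0.70),
    Config("VAE", 20, 0.8, 0.25, 0.62),
    Config("AE", 20, 0.8, 0.28, 0.54),
]

front = pareto_front(configs)
for c in front:
    print(c.method, c.k, c.rho, c.recon_error, c.cmc)
```

With these placeholder values, the two configurations dropped are the ones beaten on both objectives by the VAE entry. The same pattern extends to more than two objectives by adding terms to `dominates`.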