Mehrtash Babadi

CV | May 25, 2025

Scalable Generation of Spatial Transcriptomics from Histology Images via Whole-Slide Flow Matching

Tinglin Huang, Tianyu Liu, Mehrtash Babadi et al.

Spatial transcriptomics (ST) has emerged as a powerful technology for bridging histology imaging with gene expression profiling. However, its application has been limited by low throughput and the need for specialized experimental facilities. Prior work has sought to predict ST from whole-slide histology images to accelerate this process, but it suffers from two major limitations. First, it does not explicitly model cell-cell interactions, because it factorizes the joint distribution of whole-slide ST data and predicts the gene expression of each spot independently. Second, its encoders struggle with memory constraints due to the large number of spots (often exceeding 10,000) in typical ST datasets. Herein, we propose STFlow, a flow matching generative model that accounts for cell-cell interactions by modeling the joint distribution of gene expression across an entire slide. It also employs an efficient slide-level encoder with local spatial attention, enabling whole-slide processing without excessive memory overhead. On the recently curated HEST-1k and STImage-1K4M benchmarks, STFlow substantially outperforms state-of-the-art baselines and achieves a relative improvement of over 18% over pathology foundation models.
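The two ideas named in the abstract, flow matching and local spatial attention, can be illustrated with a minimal sketch. The abstract does not specify STFlow's actual formulation, so everything below is an assumption made for illustration: the straight-line (linear interpolation) probability path is one common flow matching choice, k-nearest-neighbor neighborhoods are one possible way to restrict attention to nearby spots, and the function names `flow_matching_pair` and `k_nearest_spots` are hypothetical.

```python
import math
import random


def flow_matching_pair(x1, t, rng):
    """Build one training pair for a straight-line flow matching path.

    x1  : target vector (e.g. gene expression of one spot), list of floats
    t   : time in [0, 1]
    rng : random.Random instance used to draw the noise sample x0

    Returns (x_t, velocity), where x_t = (1 - t) * x0 + t * x1 and the
    regression target for the velocity network is x1 - x0.
    This linear path is an assumption, not necessarily the paper's choice.
    """
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]  # Gaussian noise sample
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    velocity = [b - a for a, b in zip(x0, x1)]
    return x_t, velocity


def k_nearest_spots(coords, k):
    """Return, for each spot, the indices of its k nearest other spots.

    coords : list of (x, y) spot positions on the slide
    k      : neighborhood size

    Attending only to these neighbors stores N * k attention scores
    instead of N * N. The brute-force search here is still quadratic
    in time; a spatial index would be the usual choice at scale.
    """
    neighbors = []
    for i, (xi, yi) in enumerate(coords):
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        )
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


if __name__ == "__main__":
    rng = random.Random(0)
    x_t, v = flow_matching_pair([1.0, 2.0, 3.0], t=0.5, rng=rng)
    print("interpolated state:", x_t)
    print("target velocity:", v)
    print("neighbors:", k_nearest_spots([(0, 0), (1, 0), (5, 0), (6, 0)], k=1))
```

With 10,000 spots and, say, k = 8, the neighborhood structure holds 80,000 index pairs rather than the 100,000,000 needed for full pairwise attention, which is the kind of saving the abstract's memory argument points to.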

CV | Nov 25, 2020

CellSegmenter: unsupervised representation learning and instance segmentation of modular images

Luca D'Alessio, Mehrtash Babadi

We introduce CellSegmenter, a structured deep generative model and an amortized inference framework for unsupervised representation learning and instance segmentation tasks. The proposed inference algorithm is convolutional and parallelized, without any recurrent mechanisms, and is able to resolve object-object occlusion while simultaneously treating distant non-occluding objects independently. This leads to extremely fast training times while allowing extrapolation to an arbitrary number of instances. We further introduce a transparent posterior regularization strategy that encourages scene reconstructions with the fewest localized objects and a low-complexity background. We evaluate our method on a challenging synthetic multi-MNIST dataset with a structured background and achieve nearly perfect accuracy with only a few hundred training epochs. Finally, we show segmentation results obtained for a cell nuclei imaging dataset, demonstrating the ability of our method to provide high-quality segmentations while also handling realistic use cases involving a large number of instances.

2 Papers