LG · AI · Oct 28, 2025

Transformers can do Bayesian Clustering

arXiv:2510.24318v1 · 1 citation · h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses scalable and flexible Bayesian clustering for real-world datasets with missing values. It is an incremental improvement that adapts existing Prior-Data Fitted Networks (PFNs) to unsupervised clustering.

The paper tackles two problems in Bayesian clustering: its computational cost and its handling of uncertainty, especially when data are missing. It introduces Cluster-PFN, a Transformer-based model that estimates posterior distributions over the number of clusters and the cluster assignments. Cluster-PFN estimates the number of clusters more accurately than AIC, BIC and Variational Inference (VI), and its clustering quality is competitive with VI while running orders of magnitude faster.

Bayesian clustering accounts for uncertainty but is computationally demanding at scale. Furthermore, real-world datasets often contain missing values, and simple imputation ignores the associated uncertainty, leading to suboptimal results. We present Cluster-PFN, a Transformer-based model that extends Prior-Data Fitted Networks (PFNs) to unsupervised Bayesian clustering. Trained entirely on synthetic datasets generated from a finite Gaussian Mixture Model (GMM) prior, Cluster-PFN learns to estimate the posterior distribution over both the number of clusters and the cluster assignments. Our method estimates the number of clusters more accurately than handcrafted model selection procedures such as AIC, BIC and Variational Inference (VI), and achieves clustering quality competitive with VI while being orders of magnitude faster. Cluster-PFN can also be trained on complex priors that include missing data, and it outperforms imputation-based baselines on real-world genomic datasets at high missingness. These results show that Cluster-PFN can provide scalable and flexible Bayesian clustering.
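
To make the training setup concrete, the following is a minimal sketch of how one synthetic clustering task might be drawn from a finite GMM prior with optional missing values. It is not the authors' generator: the function name `sample_gmm_dataset`, the uniform prior over the number of clusters, the isotropic components, the hyperparameter ranges, and the completely-at-random masking are all assumptions chosen for illustration.

```python
import random


def sample_gmm_dataset(n_points=200, n_features=2, max_clusters=5,
                       missing_rate=0.0, seed=None):
    """Draw one synthetic clustering task from a simple finite GMM prior.

    Every hyperparameter below is an illustrative assumption,
    not a value taken from the paper.
    """
    rng = random.Random(seed)

    # Number of clusters K ~ Uniform{1, ..., max_clusters} (assumed prior).
    k = rng.randint(1, max_clusters)

    # Mixture weights: normalised Gamma(1, 1) draws, i.e. Dirichlet(1, ..., 1).
    raw = [rng.gammavariate(1.0, 1.0) for _ in range(k)]
    total = sum(raw)
    weights = [w / total for w in raw]

    # Component means ~ N(0, 3^2) per dimension; one isotropic std per component.
    means = [[rng.gauss(0.0, 3.0) for _ in range(n_features)] for _ in range(k)]
    stds = [rng.uniform(0.2, 1.0) for _ in range(k)]

    points, labels = [], []
    for _ in range(n_points):
        # Sample the latent assignment, then the observation.
        z = rng.choices(range(k), weights=weights)[0]
        x = [rng.gauss(means[z][d], stds[z]) for d in range(n_features)]
        # Mask entries completely at random; None marks a missing value.
        x = [None if rng.random() < missing_rate else v for v in x]
        points.append(x)
        labels.append(z)

    # (points, labels, k) is one training example: inputs plus both targets.
    return points, labels, k
```

In a PFN-style setup, many such tasks would be sampled and the Transformer trained to predict `k` and `labels` from `points`, so that a single forward pass approximates the posterior at test time.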
