Tree-Wasserstein Barycenter for Large-Scale Multilevel Clustering and Scalable Bayes
This work addresses the scalability of multilevel clustering and scalable Bayes in large-scale applications; its contribution is an incremental improvement obtained through algorithmic optimization.
The paper tackles the computational cost of Wasserstein barycenters for large-scale probability measures by introducing tree-Wasserstein barycenters. These use tree metrics as the ground metric, which enables efficient algorithms with reduced memory usage and faster computation in experiments on synthetic and real datasets.
In this paper, we study a variant of the Wasserstein barycenter problem, which we refer to as the tree-Wasserstein barycenter, obtained by using a specific class of ground metrics, namely tree metrics, for the Wasserstein distance. Drawing on the tree structure, we propose an efficient algorithmic approach to compute the tree-Wasserstein barycenter and its variants. The proposed approach is not only fast to compute but also memory-efficient. Exploiting the tree-Wasserstein barycenter and its variants, we scale up multilevel clustering and scalable Bayes, especially for large-scale applications where the number of supports in the probability measures is large. Empirically, we test our proposed approach against other baselines on large-scale synthetic and real datasets.
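
To illustrate why a tree ground metric makes computation cheap, the sketch below evaluates the 1-Wasserstein distance between two measures supported on the nodes of a rooted tree. It relies on the well-known closed form for tree metrics: the distance is the sum, over edges, of the edge length times the absolute difference in mass that the two measures place on the subtree below that edge. This is only the distance underlying the barycenter problem, not the barycenter algorithm proposed in the paper. The function name `tree_wasserstein`, the array layout, and the assumption that nodes are numbered so that every parent index is smaller than its child's index are all hypothetical choices made for this example.

```python
def tree_wasserstein(parent, edge_weight, mu, nu):
    """1-Wasserstein distance between two measures on a rooted tree.

    Assumptions (illustrative, not from the paper):
      - node 0 is the root and parent[0] == -1
      - parent[v] < v for every non-root node v
      - edge_weight[v] is the length of the edge (v, parent[v])
      - mu[v], nu[v] are the masses at node v; both sum to the same total
    """
    n = len(parent)
    # diff[v] starts as the net mass at v and accumulates the whole subtree.
    diff = [mu[v] - nu[v] for v in range(n)]
    total = 0.0
    # Visit children before parents so each subtree sum is complete
    # by the time its edge is charged.
    for v in range(n - 1, 0, -1):
        total += edge_weight[v] * abs(diff[v])
        diff[parent[v]] += diff[v]
    return total


# Small tree: 0 is the root, 1 and 2 are its children, 3 is a child of 1.
parent = [-1, 0, 0, 1]
edge_weight = [0.0, 1.0, 2.0, 0.5]

# All mass at node 3 versus all mass at node 2.
d_point = tree_wasserstein(parent, edge_weight, [0, 0, 0, 1.0], [0, 0, 1.0, 0])

# Mass split between nodes 2 and 3 versus all mass at the root.
d_split = tree_wasserstein(parent, edge_weight, [0, 0, 0.5, 0.5], [1.0, 0, 0, 0])
```

Each call runs in time linear in the number of tree nodes and stores only one number per node, which is the kind of saving in time and memory that a tree structure makes possible. In the first call the result equals the path length between nodes 3 and 2 (0.5 + 1.0 + 2.0).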