ML | LG | Nov 26, 2019

Scalable Extreme Deconvolution

arXiv:1911.11663v1 (1 citation)
Originality: Incremental advance
AI Analysis

The work enables efficient probabilistic density estimation on large, noisy datasets such as astronomical catalogs. The advance is incremental: it adapts existing methods to make them scalable.

The paper tackles the problem of scaling the Extreme Deconvolution method to large datasets such as the Gaia catalog, which contains noisy observations of a billion stars. It proposes two minibatch variants, one based on online EM and one on direct gradient-based optimisation of the log-likelihood, both of which run on GPUs. The reported result is faster fitting and the ability to scale to larger models.

The Extreme Deconvolution method fits a probability density to a dataset where each observation has Gaussian noise added with a known sample-specific covariance, originally intended for use with astronomical datasets. The existing fitting method is batch EM, which would not normally be applied to large datasets such as the Gaia catalog containing noisy observations of a billion stars. We propose two minibatch variants of extreme deconvolution, based on an online variation of the EM algorithm, and direct gradient-based optimisation of the log-likelihood, both of which can run on GPUs. We demonstrate that these methods provide faster fitting, whilst being able to scale to much larger models for use with larger datasets.
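As a rough illustration of the objective that the gradient-based variant would optimise, the sketch below evaluates the mean log-likelihood of one minibatch. It assumes the usual Gaussian-mixture form of extreme deconvolution, in which a noisy point with known noise covariance S_i has density N(x_i | mu_j, Sigma_j + S_i) under component j. The function name, argument names, and array shapes are illustrative assumptions, not taken from the paper or its repository.

```python
import numpy as np


def xd_minibatch_log_likelihood(x, S, weights, means, covs):
    """Mean log-likelihood of a minibatch of noisy points under a GMM.

    Hypothetical helper (names and shapes are assumptions):
      x:       (B, D)    noisy observations
      S:       (B, D, D) known per-sample noise covariances
      weights: (K,)      mixture weights, summing to 1
      means:   (K, D)    component means
      covs:    (K, D, D) component covariances
    """
    B, D = x.shape
    # Each sample sees each component convolved with its own noise.
    T = covs[None, :, :, :] + S[:, None, :, :]        # (B, K, D, D)
    diff = x[:, None, :] - means[None, :, :]          # (B, K, D)

    # Cholesky factor T = L L^T gives a stable log-determinant and
    # Mahalanobis distance: diff^T T^{-1} diff = ||L^{-1} diff||^2.
    L = np.linalg.cholesky(T)
    z = np.linalg.solve(L, diff[..., None])[..., 0]   # (B, K, D)
    maha = np.sum(z ** 2, axis=-1)                    # (B, K)
    log_det = 2.0 * np.sum(
        np.log(np.diagonal(L, axis1=-2, axis2=-1)), axis=-1
    )                                                 # (B, K)

    log_comp = np.log(weights)[None, :] - 0.5 * (
        D * np.log(2.0 * np.pi) + log_det + maha
    )

    # Log-sum-exp over components, shifted by the row maximum for stability.
    m = log_comp.max(axis=1, keepdims=True)
    log_px = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))
    return log_px.mean()


# Usage on a small synthetic minibatch (all values made up for illustration).
rng = np.random.default_rng(0)
B, D, K = 4, 2, 3
x = rng.normal(size=(B, D))
S = np.tile(0.1 * np.eye(D), (B, 1, 1))
weights = np.full(K, 1.0 / K)
means = rng.normal(size=(K, D))
covs = np.tile(np.eye(D), (K, 1, 1))
print(xd_minibatch_log_likelihood(x, S, weights, means, covs))
```

This NumPy version only evaluates the objective. A gradient-based fit would presumably express the same computation in an automatic-differentiation framework and take a stochastic gradient step per minibatch, with the covariances parameterised so that they stay positive definite.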

Code Implementations: 1 repo
