LG · May 20

Symbolic Density Estimation for Discrete Distributions

arXiv:2605.21813 · 62.1
Predicted impact: top 34% in LG (last 90 days) · Originality: Highly original
AI Analysis

For statisticians and data scientists, SDE automates the discovery of interpretable discrete distributions, reducing reliance on manual derivations.

The paper introduces symbolic density estimation (SDE), an unsupervised framework that automatically discovers closed-form probability mass functions for discrete distributions. The method recovers all benchmark families with accurate parameter estimates and identifies interpretable mixture models that improve goodness-of-fit over standard models on real data.

Discrete probability laws underpin statistical modeling, yet the catalog of interpretable distributions has expanded only gradually through centuries of case-by-case mathematical derivations. We introduce symbolic density estimation (SDE), an unsupervised framework that automatically recovers closed-form probability mass functions by composing elementary analytic operations within a structured search space. Our method integrates domain-specific structural priors with evolutionary search and a validity-aware inference stage, and it extends to richer distribution families such as zero inflation and finite mixtures. To support systematic evaluation and future research, we contribute a benchmark dataset spanning a broad collection of commonly used discrete distributions. The proposed algorithm recovers all benchmark families with accurate parameter estimates. A real data application shows that it identifies concise and interpretable mixture models that improve goodness-of-fit over standard models.
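To make the abstract's ingredients concrete, the following is a minimal sketch of what a validity check and a zero-inflation extension could look like for a single candidate expression. It is not the paper's algorithm: the function names (`normalize`, `zero_inflate`, `log_likelihood`), the truncated support `support_max`, and the toy data are all assumptions introduced for illustration, and the evolutionary search itself is omitted.

```python
import math

def normalize(unnormalized, support_max=200):
    """Turn a candidate expression into a valid PMF on {0, ..., support_max}.

    Rejects candidates that are negative, undefined, or not normalizable.
    Truncating the support is an assumption of this sketch.
    """
    weights = [unnormalized(k) for k in range(support_max + 1)]
    if any(math.isnan(w) or w < 0 for w in weights):
        raise ValueError("candidate takes negative or undefined values")
    total = sum(weights)
    if not (0 < total < math.inf):
        raise ValueError("candidate is not normalizable on this support")
    return [w / total for w in weights]

def zero_inflate(pmf, pi):
    """Mix a point mass at zero (weight pi) with an existing PMF."""
    inflated = [(1 - pi) * p for p in pmf]
    inflated[0] += pi
    return inflated

def log_likelihood(pmf, data):
    """Score a PMF on observed counts; higher is better."""
    return sum(math.log(pmf[x]) for x in data)

def poisson_kernel(lam):
    # Unnormalized candidate lam**k / k!, computed in log space for stability.
    return lambda k: math.exp(k * math.log(lam) - math.lgamma(k + 1))

# Hypothetical toy counts with an excess of zeros.
data = [0, 0, 0, 1, 2, 0, 3, 1, 0, 2]

base = normalize(poisson_kernel(1.5))
zip_pmf = zero_inflate(base, 0.3)

print("Poisson log-likelihood:      ", log_likelihood(base, data))
print("Zero-inflated log-likelihood:", log_likelihood(zip_pmf, data))
```

In a search loop of the kind the abstract describes, a check like `normalize` would plausibly act as a filter on candidate expressions before they are scored, and a score such as `log_likelihood` would rank the survivors.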
