On collapsed representation of hierarchical Completely Random Measures
The paper provides a method for Bayesian nonparametric modeling with hierarchical CRMs that is particularly useful for topic modeling, although the contribution is incremental because it builds on existing CRM and Poisson process frameworks.
The paper tackles the problem of generating Poisson processes from hierarchical Completely Random Measures (CRMs) without instantiating infinitely many atoms. It does so by deriving an exact marginal distribution and Gibbs sampling strategies. As an example, it applies the sum of generalized gamma process (SGGP) to topic modeling, which allows the power-law behavior of topics and words to be determined in a Bayesian way.
The aim of the paper is to provide an exact approach for generating a Poisson process sampled from a hierarchical CRM, without having to instantiate the infinitely many atoms of the random measures. We use completely random measures (CRMs) and hierarchical CRMs to define a prior for Poisson processes. We derive the marginal distribution of the resultant point process when the underlying CRM is marginalized out. Using well-known properties unique to Poisson processes, we derive an exact approach for instantiating a Poisson process with a hierarchical CRM prior. Furthermore, we derive Gibbs sampling strategies for hierarchical CRM models based on the Chinese restaurant franchise sampling scheme. As an example, we present the sum of generalized gamma process (SGGP) and show its application in topic modelling. We show that one can determine the power-law behaviour of the topics and words in a Bayesian fashion by defining a prior on the parameters of the SGGP.
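To make the idea of "marginalizing out the CRM" concrete, the following is a minimal sketch for the simplest case only: a single, non-hierarchical generalized gamma process (GGP) prior on one Poisson process. It is not the paper's hierarchical construction or the SGGP. It assumes the standard GGP Lévy density rho(s) = a / Gamma(1 - sigma) * s^(-1 - sigma) * exp(-tau * s) with 0 < sigma < 1 and tau > 0, and a base probability measure sampled by a hypothetical `base_sampler` callable. All function and parameter names are illustrative. Under these assumptions, the standard thinning and marking properties of Poisson processes imply that the number of distinct observed atoms is Poisson with mean psi(1), where psi is the Laplace exponent, and that each observed atom carries an independent count with a closed-form distribution, so no infinite set of atoms is ever instantiated.

```python
import math
import random


def ggp_laplace_exponent(t, a, sigma, tau):
    """psi(t) = integral of (1 - exp(-t*s)) rho(s) ds for the assumed GGP density."""
    return (a / sigma) * ((t + tau) ** sigma - tau ** sigma)


def count_pmf(j, sigma, tau):
    """P(count = j | atom observed at least once), for j >= 1."""
    x = 1.0 / (1.0 + tau)
    p = sigma * x / (1.0 - (1.0 - x) ** sigma)  # probability of j = 1
    for i in range(1, j):
        p *= (i - sigma) / (i + 1) * x  # ratio p_{i+1} / p_i
    return p


def sample_count(rng, sigma, tau):
    """Inverse-CDF draw from count_pmf, using the same recursion."""
    x = 1.0 / (1.0 + tau)
    p = sigma * x / (1.0 - (1.0 - x) ** sigma)
    u, j, cdf = rng.random(), 1, p
    while u > cdf and p > 0.0:  # p > 0 guards against rounding in the tail
        p *= (j - sigma) / (j + 1) * x
        j += 1
        cdf += p
    return j


def sample_poisson(rng, mean):
    """Knuth's multiplication method; adequate for moderate means."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_collapsed_ggp(rng, a, sigma, tau, base_sampler):
    """Return (atom, count) pairs of a Poisson process with the GGP marginalized out."""
    k = sample_poisson(rng, ggp_laplace_exponent(1.0, a, sigma, tau))
    return [(base_sampler(), sample_count(rng, sigma, tau)) for _ in range(k)]


rng = random.Random(0)
points = sample_collapsed_ggp(rng, a=2.0, sigma=0.5, tau=1.0, base_sampler=rng.random)
```

Here `points` is a finite list of atom locations with their multiplicities. In a hierarchical model the base measure would itself be a random measure, which is where the Chinese restaurant franchise style bookkeeping described in the abstract becomes necessary.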