Spatiotemporal Clustering with Neyman-Scott Processes via Connections to Bayesian Nonparametric Mixture Models
This work addresses the problem of modeling spatiotemporal clustering for researchers in fields like neuroscience and text analysis, offering a scalable inference method that builds incrementally on existing Bayesian nonparametric techniques.
The paper tackles scalable Bayesian inference for Neyman-Scott processes (NSPs) by establishing novel connections to Dirichlet process mixture models (DPMMs) and mixture of finite mixture models (MFMMs). The result is a collapsed Gibbs sampling algorithm that enables efficient clustering in spatiotemporal applications such as neural spike trains and document streams.
Neyman-Scott processes (NSPs) are point process models that generate clusters of points in time or space. They are natural models for phenomena ranging from neural spike trains to document streams. The clustering property is achieved via a doubly stochastic formulation: first, a set of latent events is drawn from a Poisson process; then, each latent event generates a set of observed data points according to another Poisson process. This construction is similar to Bayesian nonparametric mixture models like the Dirichlet process mixture model (DPMM) in that the number of latent events (i.e., clusters) is a random variable, but the point process formulation makes the NSP especially well suited to modeling spatiotemporal data. While many specialized algorithms have been developed for DPMMs, comparatively few works have focused on inference in NSPs. Here, we present novel connections between NSPs and DPMMs, with the key link being a third class of Bayesian mixture models called mixture of finite mixture models (MFMMs). Leveraging this connection, we adapt the standard collapsed Gibbs sampling algorithm for DPMMs to enable scalable Bayesian inference on NSP models. We demonstrate the potential of Neyman-Scott processes on a variety of applications including sequence detection in neural spike trains and event detection in document streams.
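The doubly stochastic construction described above can be illustrated with a minimal simulation sketch. It is not the paper's model or inference algorithm: it assumes a one-dimensional time window, a homogeneous Poisson process for the latent events, a fixed mean number of points per latent event, and Gaussian offsets around each latent event. The function name `sample_neyman_scott` and all parameter names and values are hypothetical, chosen only for illustration.

```python
import numpy as np

def sample_neyman_scott(rate_latent, mean_children, spread, t_max, rng):
    """Draw one realization of a simple 1-D Neyman-Scott process.

    Assumptions (illustrative only): homogeneous latent rate, the same
    mean number of points for every latent event, Gaussian offsets.
    """
    # Step 1: latent events from a homogeneous Poisson process on [0, t_max].
    # The count is Poisson; given the count, locations are uniform.
    n_latent = rng.poisson(rate_latent * t_max)
    latent_times = rng.uniform(0.0, t_max, size=n_latent)

    # Step 2: each latent event emits a Poisson number of observed points.
    counts = rng.poisson(mean_children, size=n_latent)

    # parents[i] is the index of the latent event that generated point i.
    parents = np.repeat(np.arange(n_latent), counts)

    # Observed points scatter around their latent event; some may land
    # slightly outside [0, t_max] because the offsets are unbounded.
    offsets = spread * rng.standard_normal(parents.size)
    points = latent_times[parents] + offsets
    return latent_times, points, parents

rng = np.random.default_rng(0)
latent_times, points, parents = sample_neyman_scott(
    rate_latent=0.5, mean_children=8.0, spread=0.2, t_max=20.0, rng=rng
)
print(f"{latent_times.size} latent events, {points.size} observed points")
```

In this sketch the number of clusters is itself random, which is the property the abstract compares to Bayesian nonparametric mixture models. Inference would treat `latent_times` and `parents` as unobserved and recover them from `points` alone.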