ME AP CO ML · Jun 27, 2019

Simultaneous Transformation and Rounding (STAR) Models for Integer-Valued Data

arXiv:1906.11653v2 · 19 citations
Originality: Incremental advance
AI Analysis

The paper gives statisticians and data scientists a new method for modeling integer-valued data, such as counts and scores, with applications in healthcare and environmental studies. The advance is incremental: it adapts existing Bayesian models for continuous data to integer-valued data.

The authors address the modeling of integer-valued data by proposing the Simultaneous Transformation and Rounding (STAR) framework, which produces flexible distributions for counts, scores, and rounded data. They report strong predictive accuracy on synthetic data and a large healthcare utilization dataset.

We propose a simple yet powerful framework for modeling integer-valued data, such as counts, scores, and rounded data. The data-generating process is defined by Simultaneously Transforming and Rounding (STAR) a continuous-valued process, which produces a flexible family of integer-valued distributions capable of modeling zero-inflation, bounded or censored data, and over- or underdispersion. The transformation is modeled as unknown for greater distributional flexibility, while the rounding operation ensures a coherent integer-valued data-generating process. An efficient MCMC algorithm is developed for posterior inference and provides a mechanism for adaptation of successful Bayesian models and algorithms for continuous data to the integer-valued data setting. Using the STAR framework, we design a new Bayesian Additive Regression Tree (BART) model for integer-valued data, which demonstrates impressive predictive distribution accuracy for both synthetic data and a large healthcare utilization dataset. For interpretable regression-based inference, we develop a STAR additive model, which offers greater flexibility and scalability than existing integer-valued models. The STAR additive model is applied to study the recent decline in Amazon river dolphins.
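
The data-generating process described in the abstract can be illustrated with a minimal simulation sketch. It rests on several assumptions that are not taken from the paper: the transformation is fixed to a logarithm (the abstract says the transformation is modeled as unknown), rounding is done with a floor, the latent process is plain Gaussian noise around a mean, and the name `simulate_star` and its parameters are hypothetical.

```python
import numpy as np


def simulate_star(mu, sigma, rng, upper=None):
    """Simulate integer data by transforming and rounding a latent Gaussian.

    Illustrative assumptions (not the authors' specification):
    the transformation g is log, so its inverse is exp, and the
    rounding operator is floor.
    """
    # Continuous latent process on the transformed scale.
    z_star = rng.normal(loc=mu, scale=sigma)
    # Undo the transformation to get a continuous value on the data scale.
    y_star = np.exp(z_star)
    # Round down to an integer; every z_star < 0 maps to y = 0.
    y = np.floor(y_star).astype(np.int64)
    # Optional upper bound, mimicking bounded or censored data.
    if upper is not None:
        y = np.minimum(y, upper)
    return y


rng = np.random.default_rng(0)
mu = np.full(10_000, 0.5)
y = simulate_star(mu, sigma=1.0, rng=rng)
zero_fraction = float(np.mean(y == 0))
print(zero_fraction, y.mean(), y.var())
```

Under these assumptions, every latent draw below zero is rounded to zero, so the simulated counts carry substantial mass at zero without a separate zero-inflation component. Passing `upper` caps the counts, which is one simple way to picture the bounded case.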

Code Implementations: 1 repo
