ML, Nov 20, 2017

Subgroup Identification and Interpretation with Bayesian Nonparametric Models in Health Care Claims Data

arXiv:1711.07527v1

Originality: Incremental advance

AI Analysis

This work aims to help policymakers and researchers understand health care spending growth by offering an incremental method for analyzing complex inpatient utilization data.

Inpatient utilization patterns are hard to model because of zero inflation, over-dispersion, and skewness. The paper develops a Bayesian nonparametric mixture model that accommodates these features and applies it to hospital length-of-stay data for patients with lung cancer. The model identifies distinct patient subgroups that differ in means, variances, and covariate relationships.

Inpatient care is a large share of total health care spending, making analysis of inpatient utilization patterns an important part of understanding what drives health care spending growth. Common features of inpatient utilization measures include zero inflation, over-dispersion, and skewness, all of which complicate statistical modeling. Mixture modeling is a popular approach that can accommodate these features of health care utilization data. In this work, we add a nonparametric clustering component to such models. Our fully Bayesian model framework allows for an unknown number of mixing components, so that the data determine the number of mixture components. When we apply the modeling framework to data on hospital lengths of stay for patients with lung cancer, we find distinct subgroups of patients with differences in means and variances of hospital days, health and treatment covariates, and relationships between covariates and length of stay.
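To illustrate how a mixture model can leave the number of components open, here is a minimal sketch of a truncated stick-breaking construction. It assumes a Dirichlet process prior, which the abstract does not name, so it is a common choice used for illustration rather than the authors' specification. The concentration parameter `alpha`, the truncation level, and the patient count are all hypothetical values.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw mixture weights from a truncated stick-breaking construction.

    Assumption: a Dirichlet process prior with concentration `alpha`;
    the source abstract does not state which nonparametric prior is used.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for _ in range(truncation - 1):
        fraction = rng.betavariate(1.0, alpha)  # Beta(1, alpha) break point
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)  # last component takes the leftover mass
    return weights


def count_occupied(weights, n_patients, rng):
    """Assign patients to components and count how many are actually used."""
    labels = rng.choices(range(len(weights)), weights=weights, k=n_patients)
    return len(set(labels))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
weights = stick_breaking_weights(alpha=1.0, truncation=20, rng=rng)
n_used = count_occupied(weights, n_patients=500, rng=rng)
print(f"{n_used} of {len(weights)} components are occupied")
```

The truncation level is only an upper bound. Most of the weight usually falls on a few components, so the number of occupied subgroups is driven by the draws and not fixed in advance. In a full model each component would also carry its own length-of-stay distribution and covariate effects, which this sketch omits.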
