Easy Variational Inference for Categorical Models via an Independent Binary Approximation
This work addresses scalability issues in Bayesian analysis for categorical data, which is incremental as it builds on existing binary models and variational methods.
The authors tackled the challenge of scaling Bayesian inference for categorical generalized linear models (GLMs) beyond a few dozen categories by introducing categorical-from-binary (CB) models, which enable fast and parallelizable variational inference, scaling to thousands of categories and outperforming competitors like ADVI and NUTS in time to achieve fixed prediction quality.
We pursue tractable Bayesian analysis of generalized linear models (GLMs) for categorical data. Thus far, GLMs are difficult to scale to more than a few dozen categories due to non-conjugacy or strong posterior dependencies when using conjugate auxiliary variable methods. We define a new class of GLMs for categorical data called categorical-from-binary (CB) models. Each CB model has a likelihood that is bounded by the product of binary likelihoods, suggesting a natural posterior approximation. This approximation makes inference straightforward and fast; using well-known auxiliary variables for probit or logistic regression, the product of binary models admits conjugate closed-form variational inference that is embarrassingly parallel across categories and invariant to category ordering. Moreover, an independent binary model simultaneously approximates multiple CB models. Bayesian model averaging over these can improve the quality of the approximation for any given dataset. We show that our approach scales to thousands of categories, outperforming posterior estimation competitors like Automatic Differentiation Variational Inference (ADVI) and No U-Turn Sampling (NUTS) in the time required to achieve fixed prediction quality.