LG ML · Mar 22, 2018

End-to-End Learning for the Deep Multivariate Probit Model

arXiv:1803.08591v4 · 7.926 citations

Originality: Incremental advance

AI Analysis

This work addresses the computational bottleneck faced by researchers and practitioners who use the MVP for multi-entity modeling problems, offering a faster and more effective method. The advance is incremental: it builds on the classic MVP and adds deep learning enhancements.

The authors tackle the computational challenge of learning the multivariate probit model (MVP) by proposing the Deep Multivariate Probit Model (DMVP), an end-to-end learning scheme that combines efficient parallel sampling with GPU-boosted deep neural networks. DMVP trains at least an order of magnitude faster than the classical MVP and improves the joint likelihood compared with several competitive models.

The multivariate probit model (MVP) is a popular classic model for studying binary responses of multiple entities. Nevertheless, the computational challenge of learning the MVP model, given that its likelihood involves integrating over a multidimensional constrained space of latent variables, significantly limits its application in practice. We propose a flexible deep generalization of the classic MVP, the Deep Multivariate Probit Model (DMVP), which is an end-to-end learning scheme that uses an efficient parallel sampling process of the multivariate probit model to exploit GPU-boosted deep neural networks. We present both theoretical and empirical analysis of the convergence behavior of DMVP's sampling process with respect to the resolution of the correlation structure. We provide convergence guarantees for DMVP and our empirical analysis demonstrates the advantages of DMVP's sampling compared with standard MCMC-based methods. We also show that when applied to multi-entity modelling problems, which are natural DMVP applications, DMVP trains faster than classical MVP, by at least an order of magnitude, captures rich correlations among entities, and further improves the joint likelihood of entities compared with several competitive models.
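To make the integration problem concrete, here is a minimal sketch of a sampling-based estimate of the MVP joint probability. It is not the authors' implementation. It assumes the latent covariance can be written as an identity matrix plus a residual covariance `sigma_r`, so that, conditioned on a draw `w ~ N(0, sigma_r)`, the entities become independent and each contributes a univariate normal CDF. The function names, the decomposition, and the sample count are illustrative assumptions; all draws are generated in one batched call, which is the kind of operation that parallelizes well.

```python
import numpy as np
from math import erf, sqrt


def std_normal_cdf(x):
    """Standard normal CDF, applied elementwise to an array."""
    return 0.5 * (1.0 + np.vectorize(erf)(np.asarray(x, dtype=float) / sqrt(2.0)))


def mvp_joint_prob(y, mu, sigma_r, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(y) under a multivariate probit model.

    Assumed (hypothetical) setup: latent z ~ N(mu, I + sigma_r) and
    y_i = 1 if z_i > 0 else 0. Conditioned on w ~ N(0, sigma_r),
    P(y_i = 1 | w) = Phi(mu_i + w_i), independently across entities.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    d = mu.shape[0]

    # Draw every sample of the correlated residual in one batched call.
    w = rng.multivariate_normal(np.zeros(d), sigma_r, size=n_samples)

    # Map labels {0, 1} to signs {-1, +1} so one CDF call covers both cases.
    signs = 2.0 * y - 1.0
    per_entity = std_normal_cdf(signs * (mu + w))  # shape (n_samples, d)

    # Product over entities, then average over samples.
    return float(per_entity.prod(axis=1).mean())
```

For example, `mvp_joint_prob([1, 1], [0.0, 0.0], np.ones((2, 2)))` estimates the probability that both of two entities with latent correlation 0.5 are present. With a zero `sigma_r`, the estimate reduces to the product of independent probit probabilities.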
