Disentangled Variational Autoencoder based Multi-Label Classification with Covariance-Aware Multivariate Probit Model
This work addresses the problem of predicting multiple correlated labels, for applications such as image or text tagging, and represents an incremental improvement achieved with a novel hybrid method.
The paper tackles multi-label classification by proposing MPVAE, a framework that learns latent embeddings and models label correlations using a multivariate probit model, achieving state-of-the-art performance on real-world datasets and showing robustness to noise.
Multi-label classification is the challenging task of predicting the presence and absence of multiple targets, involving representation learning and label correlation modeling. We propose a novel framework for multi-label classification, Multivariate Probit Variational AutoEncoder (MPVAE), that effectively learns latent embedding spaces as well as label correlations. MPVAE learns and aligns two probabilistic embedding spaces for labels and features respectively. The decoder of MPVAE takes in the samples from the embedding spaces and models the joint distribution of output targets under a Multivariate Probit model by learning a shared covariance matrix. We show that MPVAE outperforms the existing state-of-the-art methods on a variety of application domains, using public real-world datasets. MPVAE is further shown to remain robust under noisy settings. Lastly, we demonstrate the interpretability of the learned covariance by a case study on a bird observation dataset.
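To illustrate how a multivariate probit model ties label correlations to a covariance matrix, the following is a minimal sketch and not the MPVAE implementation. It assumes each label is active when a latent Gaussian score is positive, and it estimates the joint probability of a label vector by Monte Carlo sampling. The names `sample_probit_labels` and `estimate_joint_prob`, the mean vector `mu` (standing in for a decoder output), and the example covariance matrices are all hypothetical and chosen only for illustration.

```python
import numpy as np

def sample_probit_labels(mu, sigma, n_samples=10000, seed=0):
    """Draw binary label vectors under a multivariate probit model.

    Assumption: label i is 1 when the latent score z_i > 0, with z ~ N(mu, sigma).
    """
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal(mu, sigma, size=n_samples)  # shape (n_samples, n_labels)
    return (z > 0).astype(int)

def estimate_joint_prob(mu, sigma, y, n_samples=10000, seed=0):
    """Monte Carlo estimate of P(labels == y) under the model above."""
    labels = sample_probit_labels(mu, sigma, n_samples, seed)
    matches = np.all(labels == np.asarray(y), axis=1)
    return float(np.mean(matches))

# Hypothetical two-label example: same means, different covariance.
mu = np.zeros(2)
sigma_corr = np.array([[1.0, 0.8],
                       [0.8, 1.0]])   # positively correlated labels
sigma_ind = np.eye(2)                 # independent labels

p_corr = estimate_joint_prob(mu, sigma_corr, [1, 1])
p_ind = estimate_joint_prob(mu, sigma_ind, [1, 1])
print(f"P(both labels present), correlated:  {p_corr:.3f}")
print(f"P(both labels present), independent: {p_ind:.3f}")
```

With zero means, the closed-form value of the joint probability is 1/4 + arcsin(rho)/(2*pi), so the correlated case (rho = 0.8) should come out near 0.40 and the independent case near 0.25. The gap between the two shows how a positive off-diagonal covariance entry raises the probability of the two labels co-occurring, which is the kind of dependency a per-label independent classifier cannot express.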