LG ML
Jun 30, 2016

Unsupervised Learning with Imbalanced Data via Structure Consolidation Latent Variable Model

Fariba Yousefi, Zhenwen Dai, Carl Henrik Ek, Neil Lawrence

arXiv:1607.00067v1

Originality: Incremental advance

AI Analysis

This paper addresses the problem of handling imbalanced data in unsupervised learning, with an application to medical image analysis. It appears incremental because it builds on existing Gaussian Process Latent Variable Models.

The paper tackles unsupervised learning on imbalanced data, where models are often dominated by the major categories. It develops a latent variable model that divides the latent space into a shared space and a private space, and demonstrates the model on an imbalanced medical image dataset.

Unsupervised learning on imbalanced data is challenging because, when given imbalanced data, current models are often dominated by the major category and ignore the categories with small amounts of data. We develop a latent variable model that can cope with imbalanced data by dividing the latent space into a shared space and a private space. Based on Gaussian Process Latent Variable Models, we propose a new kernel formulation that enables the separation of the latent space, and we derive an efficient variational inference method. The performance of our model is demonstrated on an imbalanced medical image dataset.
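
The abstract does not spell out the kernel, so the following is only a rough sketch of one way a covariance function could separate a shared latent space from a private one. It is not the authors' formulation. The names `rbf`, `shared_private_kernel`, `x_shared`, `x_private`, and `groups` are hypothetical, and the sketch assumes, purely for illustration, that a group assignment per data point is available. The shared term couples every pair of points, while the private term couples only points in the same group.

```python
import numpy as np


def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale ** 2)


def shared_private_kernel(x_shared, x_private, groups):
    """Illustrative covariance (an assumption, not the paper's kernel).

    K = K_shared + K_private * same_group, where same_group[i, j] is 1 when
    points i and j belong to the same group and 0 otherwise.
    """
    k_shared = rbf(x_shared, x_shared)
    same_group = (groups[:, None] == groups[None, :]).astype(float)
    k_private = rbf(x_private, x_private) * same_group
    return k_shared + k_private


# Toy imbalanced setup: 8 points in the major group, 2 in the minor group.
rng = np.random.default_rng(0)
groups = np.array([0] * 8 + [1] * 2)
x_shared = rng.normal(size=(10, 2))   # latent coordinates shared by all points
x_private = rng.normal(size=(10, 1))  # latent coordinates private to each group
K = shared_private_kernel(x_shared, x_private, groups)
```

In this sketch the cross-group blocks of `K` depend only on the shared coordinates, so the minor group keeps its own private structure instead of being explained entirely by the dominant group. The elementwise product of a valid kernel with the block indicator matrix remains positive semi-definite, so `K` is still a valid covariance matrix.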
