Safe Semi-Supervised Learning of Sum-Product Networks
This work addresses the cost of labeled data in domains where unlabeled data are abundant by offering a safe and efficient semi-supervised learning method for SPNs. The contribution is incremental in that it extends existing semi-supervised techniques to a new model type.
The paper tackles semi-supervised learning for Sum-Product Networks (SPNs), a class of deep probabilistic models, by introducing a parameter learning method that ensures performance does not degrade when unlabeled data are added. The results show that this safe approach is competitive with state-of-the-art methods and can achieve better generative and discriminative objective values than purely supervised learning.
In several domains, obtaining class annotations is expensive while unlabelled data are abundant. While most semi-supervised approaches enforce restrictive assumptions on the data distribution, recent work has managed to learn semi-supervised models in a non-restrictive regime. However, so far such approaches have only been proposed for linear models. In this work, we introduce semi-supervised parameter learning for Sum-Product Networks (SPNs). SPNs are deep probabilistic models that admit inference in time linear in the number of network edges. Our approach has several advantages, as it (1) allows generative and discriminative semi-supervised learning, (2) guarantees that adding unlabelled data can increase, but not degrade, the performance (safe), and (3) is computationally efficient and does not enforce restrictive assumptions on the data distribution. We show on a variety of data sets that safe semi-supervised learning with SPNs is competitive with the state of the art and can lead to a better generative and discriminative objective value than a purely supervised approach.
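
To make the linear-time inference claim concrete, the following is a minimal sketch of bottom-up evaluation in a toy SPN over a binary class variable Y and a binary feature X. It is an illustration only and not the learning method proposed in the paper: the class names (`Leaf`, `Product`, `Sum`), the `evaluate` function, and all weights are hypothetical choices made for this example. Each node value is cached, so every edge is traversed once per query. Leaving a variable out of the evidence sets all of its indicator leaves to 1, which marginalizes it. Here this is how an unlabelled sample (X observed, Y missing) would be scored under the generative objective.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple, Union


@dataclass
class Leaf:
    """Indicator leaf: 1 if `var` equals `value` or `var` is unobserved."""
    var: str
    value: int


@dataclass
class Product:
    children: List["Node"]


@dataclass
class Sum:
    weighted_children: List[Tuple[float, "Node"]]  # (weight, child) pairs


Node = Union[Leaf, Product, Sum]


def evaluate(node: Node, evidence: Dict[str, int],
             cache: Optional[Dict[int, float]] = None) -> float:
    """Evaluate the SPN bottom-up for the given (possibly partial) evidence."""
    if cache is None:
        cache = {}
    key = id(node)
    if key in cache:  # shared sub-networks are computed only once
        return cache[key]
    if isinstance(node, Leaf):
        observed = evidence.get(node.var)
        # An unobserved variable is marginalized: its indicators are all 1.
        result = 1.0 if observed is None or observed == node.value else 0.0
    elif isinstance(node, Product):
        result = 1.0
        for child in node.children:
            result *= evaluate(child, evidence, cache)
    else:
        result = sum(w * evaluate(child, evidence, cache)
                     for w, child in node.weighted_children)
    cache[key] = result
    return result


# Toy network (hypothetical weights): P(Y=1) = 0.6,
# P(X=1 | Y=1) = 0.9, P(X=1 | Y=0) = 0.2.
x_given_y1 = Sum([(0.9, Leaf("X", 1)), (0.1, Leaf("X", 0))])
x_given_y0 = Sum([(0.2, Leaf("X", 1)), (0.8, Leaf("X", 0))])
root = Sum([
    (0.6, Product([Leaf("Y", 1), x_given_y1])),
    (0.4, Product([Leaf("Y", 0), x_given_y0])),
])

joint = evaluate(root, {"X": 1, "Y": 1})   # labelled sample: P(X=1, Y=1)
marginal = evaluate(root, {"X": 1})        # unlabelled sample: P(X=1)
posterior = joint / marginal               # class posterior: P(Y=1 | X=1)
total = evaluate(root, {})                 # no evidence: sums to 1
```

The joint and marginal values correspond to the quantities a generative objective would score for labelled and unlabelled samples respectively, while the posterior is the quantity a discriminative objective would score. All three are obtained from a single upward pass each.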