ML LG NE

Dec 16, 2014

Learning with Pseudo-Ensembles

Philip Bachman, Ouais Alsharif, Doina Precup

arXiv:1412.4864v1

657 citations

Originality: Highly original

AI Analysis

This work aims to improve model generalization and semi-supervised learning, offering an approach that extends beyond existing methods such as dropout.

The paper formalizes pseudo-ensembles, which are collections of child models generated by perturbing a parent model, and introduces a regularizer that makes the ensemble's behavior robust to the noise process generating it. The regularizer matches dropout in the fully-supervised setting and achieves state-of-the-art results in the semi-supervised setting. A case study converts the Recursive Neural Tensor Network into a pseudo-ensemble, which significantly improves its performance on a real-world sentiment analysis benchmark.

We formalize the notion of a pseudo-ensemble, a (possibly infinite) collection of child models spawned from a parent model by perturbing it according to some noise process. For example, dropout (Hinton et al., 2012) in a deep neural network trains a pseudo-ensemble of child subnetworks generated by randomly masking nodes in the parent network. We present a novel regularizer based on making the behavior of a pseudo-ensemble robust with respect to the noise process generating it. In the fully-supervised setting, our regularizer matches the performance of dropout. But, unlike dropout, our regularizer naturally extends to the semi-supervised setting, where it produces state-of-the-art results. We provide a case study in which we transform the Recursive Neural Tensor Network of Socher et al. (2013) into a pseudo-ensemble, which significantly improves its performance on a real-world sentiment analysis benchmark.
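The pseudo-ensemble idea in the abstract can be illustrated with a minimal sketch. This is not the paper's regularizer: the tiny one-hidden-layer network, the squared-difference penalty, and every name below (`forward`, `sample_mask`, `consistency_penalty`) are assumptions chosen for illustration. The sketch only shows how child models can be sampled from a parent by dropout-style masking, and how disagreement between parent and children can be measured on an input that has no label.

```python
import math
import random


def forward(x, w_hidden, w_out, mask=None, keep_prob=1.0):
    """One-hidden-layer network; `mask` marks which hidden units stay active."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    if mask is not None:
        # Inverted dropout: zero the masked units and rescale the survivors.
        hidden = [h * m / keep_prob for h, m in zip(hidden, mask)]
    return sum(w * h for w, h in zip(w_out, hidden))


def sample_mask(n_hidden, keep_prob, rng):
    """Noise process: keep each hidden unit independently with prob. keep_prob."""
    return [1.0 if rng.random() < keep_prob else 0.0 for _ in range(n_hidden)]


def consistency_penalty(x, w_hidden, w_out, keep_prob, n_children, rng):
    """Mean squared gap between the parent output and sampled child outputs.

    Illustrative stand-in for a robustness regularizer; it needs no label,
    so it can also be evaluated on unlabeled inputs.
    """
    parent_out = forward(x, w_hidden, w_out)
    total = 0.0
    for _ in range(n_children):
        mask = sample_mask(len(w_hidden), keep_prob, rng)
        child_out = forward(x, w_hidden, w_out, mask, keep_prob)
        total += (child_out - parent_out) ** 2
    return total / n_children


rng = random.Random(0)
# Hypothetical parent model: 3 inputs, 8 hidden units, 1 output.
w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(8)]
w_out = [rng.uniform(-1, 1) for _ in range(8)]
x_unlabeled = [0.5, -1.0, 2.0]

penalty = consistency_penalty(x_unlabeled, w_hidden, w_out, 0.5, 100, rng)
no_noise = consistency_penalty(x_unlabeled, w_hidden, w_out, 1.0, 10, rng)
```

With `keep_prob=1.0` every child is identical to the parent, so the penalty is zero; with `keep_prob=0.5` the children disagree with the parent and the penalty is positive. In a semi-supervised setup, a term of this kind would be added to the supervised loss and evaluated on both labeled and unlabeled inputs.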
