ML LG Sep 25, 2019

Stochastic Prototype Embeddings

Tyler R. Scott, Karl Ridgeway, Michael C. Mozer

arXiv:1909.11702v18.315 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for more robust and interpretable embedding methods in machine learning, particularly for few-shot learning and noisy data scenarios, though it is incremental as it builds on existing deterministic and stochastic approaches.

The paper improves supervised deep-embedding methods by treating embeddings as random variables, extending Prototypical Networks to handle uncertainty. The method achieves higher large- and open-set classification accuracy than the state-of-the-art stochastic method, Hedged Instance Embeddings, improves few-shot learning performance, and handles label noise and out-of-distribution inputs more gracefully.

Supervised deep-embedding methods project inputs of a domain to a representational space in which same-class instances lie near one another and different-class instances lie far apart. We propose a probabilistic method that treats embeddings as random variables. Extending a state-of-the-art deterministic method, Prototypical Networks (Snell et al., 2017), our approach supposes the existence of a class prototype around which class instances are Gaussian distributed. The prototype posterior is a product distribution over labeled instances, and query instances are classified by marginalizing relative prototype proximity over embedding uncertainty. We describe an efficient sampler for approximate inference that allows us to train the model at roughly the same space and time cost as its deterministic sibling. Incorporating uncertainty improves performance on few-shot learning and gracefully handles label noise and out-of-distribution inputs. Compared to the state-of-the-art stochastic method, Hedged Instance Embeddings (Oh et al., 2019), we achieve superior large- and open-set classification accuracy. Our method also aligns class-discriminating features with the axes of the embedding space, yielding an interpretable, disentangled representation.
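The two central operations in the abstract, forming a prototype posterior as a product of Gaussians and classifying a query by marginalizing over its embedding uncertainty, can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes diagonal-covariance Gaussians, a uniform class prior, and plain Monte Carlo sampling in place of the paper's efficient sampler. The function names `prototype_posterior` and `classify_query` and all parameter names are hypothetical.

```python
import numpy as np


def prototype_posterior(means, variances):
    """Combine per-instance diagonal Gaussians into one Gaussian.

    The product of Gaussian densities is (up to normalization) a Gaussian
    whose precision is the sum of precisions and whose mean is the
    precision-weighted average of the means.
    means, variances: arrays of shape (n_support, dim).
    """
    precisions = 1.0 / variances
    post_var = 1.0 / precisions.sum(axis=0)
    post_mean = post_var * (precisions * means).sum(axis=0)
    return post_mean, post_var


def classify_query(q_mean, q_var, proto_means, proto_vars,
                   n_samples=2000, seed=0):
    """Estimate class probabilities for an uncertain query embedding.

    Assumption: each sampled embedding is scored by its log-density under
    each class prototype Gaussian, a softmax over classes is taken per
    sample, and the results are averaged over samples.
    q_mean, q_var: shape (dim,); proto_means, proto_vars: shape (n_classes, dim).
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n_samples, q_mean.shape[0]))
    z = q_mean + np.sqrt(q_var) * noise                 # (samples, dim)
    diff = z[:, None, :] - proto_means[None, :, :]      # (samples, classes, dim)
    log_lik = -0.5 * (diff ** 2 / proto_vars
                      + np.log(2.0 * np.pi * proto_vars)).sum(axis=-1)
    log_lik -= log_lik.max(axis=1, keepdims=True)       # numerical stability
    probs = np.exp(log_lik)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs.mean(axis=0)                           # (classes,)


# Two support embeddings per class, each with its own uncertainty.
class_a = prototype_posterior(np.array([[0.0, 0.0], [2.0, 2.0]]),
                              np.array([[1.0, 1.0], [1.0, 1.0]]))
class_b = prototype_posterior(np.array([[6.0, 6.0], [8.0, 8.0]]),
                              np.array([[1.0, 1.0], [1.0, 1.0]]))
proto_means = np.stack([class_a[0], class_b[0]])
proto_vars = np.stack([class_a[1], class_b[1]])
class_probs = classify_query(np.array([1.2, 0.8]), np.array([0.1, 0.1]),
                             proto_means, proto_vars)
```

In this sketch a support instance with large variance contributes little precision, so it pulls the prototype less than a confident instance does, which gives one intuition for why a model of this kind might tolerate noisy labels.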
