CV · CL · Jun 24, 2016

Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles

arXiv:1606.07839v3 · 195 citations
Originality: Highly original
AI Analysis

This work addresses the need for multiple predictions in systems with user interactions or oracle evaluations, offering a parameter-free and architecture-agnostic solution.

The paper trains diverse ensembles of deep networks so that a perception system can output several highly likely hypotheses instead of one, and reports lower oracle error than existing methods across a range of tasks and architectures.

Many practical perception systems exist within larger processes that include interactions with users or additional components capable of evaluating the quality of predicted solutions. In these contexts, it is beneficial to provide these oracle mechanisms with multiple highly likely hypotheses rather than a single prediction. In this work, we pose the task of producing multiple outputs as a learning problem over an ensemble of deep networks -- introducing a novel stochastic gradient descent based approach to minimize the loss with respect to an oracle. Our method is simple to implement, agnostic to both architecture and loss function, and parameter-free. Our approach achieves lower oracle error compared to existing methods on a wide range of tasks and deep architectures. We also show qualitatively that the diverse solutions produced often provide interpretable representations of task ambiguity.
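As a rough illustration of training an ensemble against an oracle, here is a minimal sketch. It assumes the oracle loss for an example is the smallest loss achieved by any ensemble member, and that each SGD step updates only the member that currently wins on that example; the abstract does not spell out the update rule, so treat this winner-take-all scheme as an assumption. The names `oracle_loss` and `train_winner_take_all`, the squared loss, and the toy data are all hypothetical, and scalar constant predictors stand in for deep networks.

```python
import random


def squared_loss(pred, target):
    return (pred - target) ** 2


def oracle_loss(preds, target):
    # Assumption: the oracle selects the best ensemble member per example,
    # so the ensemble is charged only the minimum loss over its members.
    return min(squared_loss(p, target) for p in preds)


def train_winner_take_all(targets, n_members=2, lr=0.1, epochs=50, seed=0):
    rng = random.Random(seed)
    # Each "network" is a single scalar parameter (a constant predictor).
    members = [rng.uniform(-0.5, 0.5) for _ in range(n_members)]
    data = list(targets)
    for _ in range(epochs):
        rng.shuffle(data)
        for y in data:
            losses = [squared_loss(m, y) for m in members]
            winner = losses.index(min(losses))
            # Gradient of the squared loss, applied to the winning member only.
            grad = 2.0 * (members[winner] - y)
            members[winner] -= lr * grad
    return members


# Toy ambiguous task: targets cluster around two modes, -1 and +1.
targets = [-1.0, -1.1, -0.9, 1.0, 1.1, 0.9]
members = train_winner_take_all(targets)

oracle_err = sum(oracle_loss(members, y) for y in targets) / len(targets)

# Baseline: one constant predictor at the mean, the squared-loss minimizer.
mean_pred = sum(targets) / len(targets)
single_err = sum(squared_loss(mean_pred, y) for y in targets) / len(targets)

print("members:", sorted(members))
print("oracle error:", oracle_err, "single-predictor error:", single_err)
```

On this toy data the two members settle near the two modes, whereas a single predictor is pulled to the mean between them, which is one way to picture how diverse outputs can reflect task ambiguity.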
