CV, LG, NE | Nov 19, 2015

Why M Heads are Better than One: Training a Diverse Ensemble of Deep Networks

arXiv:1511.06314v1 | 338 citations
Originality: Incremental advance
AI Analysis

This work addresses how to optimize ensemble performance, a practical concern for machine learning practitioners. The advance is incremental because it builds on existing ensembling methods.

The paper tackles the problem of creating effective ensembles of deep neural networks by proposing and evaluating novel strategies like TreeNets and diversity-encouraging losses, achieving significantly higher oracle accuracies than classical ensembles.

Convolutional Neural Networks have achieved state-of-the-art performance on a wide range of tasks. Most benchmarks are led by ensembles of these powerful learners, but ensembling is typically treated as a post-hoc procedure implemented by averaging independently trained models with model variation induced by bagging or random initialization. In this paper, we rigorously treat ensembling as a first-class problem to explicitly address the question: what are the best strategies to create an ensemble? We first compare a large number of ensembling strategies, and then propose and evaluate novel strategies, such as parameter sharing (through a new family of models we call TreeNets) as well as training under ensemble-aware and diversity-encouraging losses. We demonstrate that TreeNets can improve ensemble performance and that diverse ensembles can be trained end-to-end under a unified loss, achieving significantly higher "oracle" accuracies than classical ensembles.
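The abstract contrasts classical averaging with "oracle" accuracy but does not define the latter here. The sketch below assumes the common reading: an example counts as correct under the oracle if at least one ensemble member predicts its label, whereas classical ensembling averages the members' class probabilities and takes the argmax. The function names, array layout (members, examples, classes), and toy numbers are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ensemble_mean_accuracy(probs, labels):
    """Accuracy of the classical ensemble: average member probabilities, then argmax.

    probs:  array of shape (M, N, C) with M members, N examples, C classes (assumed layout).
    labels: integer array of shape (N,).
    """
    avg = probs.mean(axis=0)                      # (N, C) averaged probabilities
    return float((avg.argmax(axis=1) == labels).mean())

def oracle_accuracy(probs, labels):
    """Assumed 'oracle' accuracy: an example is correct if ANY member gets it right."""
    preds = probs.argmax(axis=2)                  # (M, N) per-member predictions
    any_correct = (preds == labels[None, :]).any(axis=0)
    return float(any_correct.mean())

# Toy data: two members that disagree on every example.
probs = np.array([
    [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]],  # member 0
    [[0.4, 0.6], [0.3, 0.7], [0.7, 0.3]],  # member 1
])
labels = np.array([0, 1, 0])

mean_acc = ensemble_mean_accuracy(probs, labels)
oracle_acc = oracle_accuracy(probs, labels)
print(f"averaged ensemble accuracy: {mean_acc:.3f}")
print(f"oracle accuracy:            {oracle_acc:.3f}")
```

In this toy case the two members are diverse enough that one of them is right on every example, so the oracle score exceeds the averaged-ensemble score. That gap is the kind of quantity a diversity-encouraging loss would aim to widen.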
