LG AI CV | Jan 28, 2023

Anticipate, Ensemble and Prune: Improving Convolutional Neural Networks via Aggregated Early Exits

Simone Sarti, Eugenio Lomurno, Matteo Matteucci

arXiv:2301.12168v16.66 citations | h-index: 10

Originality: Incremental advance

AI Analysis

This work addresses efficiency and performance issues in neural networks, particularly for edge computing, though it appears incremental as it builds on existing early exit methods.

The paper tackles the problem of underutilized intermediate information in convolutional neural networks by introducing Anticipate, Ensemble and Prune (AEP), a training technique using weighted ensembles of early exits, which improves average accuracy by up to 15% over traditional training while reducing parameters by up to 41% and latency by 16%.

Today, artificial neural networks are the state of the art for solving a variety of complex tasks, especially in image classification. Such architectures consist of a sequence of stacked layers that extract useful information and pass it to a classifier to make accurate predictions. However, intermediate information within such models is often left unused. In other cases, such as in edge computing contexts, these architectures are divided into multiple partitions that are made functional by including early exits, i.e. intermediate classifiers, with the goal of reducing the computational and temporal load without severely compromising classification accuracy. In this paper, we present Anticipate, Ensemble and Prune (AEP), a new training technique based on weighted ensembles of early exits, which aims to exploit the information in the structure of networks to maximise their performance. Through a comprehensive set of experiments, we show that this approach can yield average accuracy improvements of up to 15% over traditional training. In its hybrid-weighted configuration, AEP's internal pruning operation also reduces the number of parameters by up to 41%, lowering the number of multiplications and additions by 18% and the inference latency by 16%. By using AEP, it is also possible to learn weights that allow early exits to achieve better accuracy values than those obtained from single-output reference models.
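The idea of a weighted ensemble of early exits can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: it assumes each exit produces class logits, that the ensemble averages per-exit softmax probabilities, and that the weights are normalised to sum to one. The function names, logits, and weight values are all hypothetical; the abstract does not specify how AEP aggregates the exits or how its weights are chosen, and the pruning step is not shown.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_exit_ensemble(exit_logits, exit_weights):
    """Combine per-exit class probabilities with a normalised weighted average.

    exit_logits: one list of class logits per exit (early exits plus the final one).
    exit_weights: one non-negative weight per exit (assumed values, not from the paper).
    """
    if len(exit_logits) != len(exit_weights):
        raise ValueError("need exactly one weight per exit")
    weight_sum = sum(exit_weights)
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    num_classes = len(exit_logits[0])
    combined = [0.0] * num_classes
    for logits, weight in zip(exit_logits, exit_weights):
        probs = softmax(logits)
        for c in range(num_classes):
            combined[c] += (weight / weight_sum) * probs[c]
    return combined


# Three exits, three classes; logits and weights are made up for illustration.
logits_per_exit = [
    [2.0, 1.0, 0.1],  # shallow early exit
    [1.5, 1.8, 0.2],  # deeper early exit
    [0.5, 2.5, 0.3],  # final classifier
]
weights = [0.2, 0.3, 0.5]
ensemble_probs = weighted_exit_ensemble(logits_per_exit, weights)
predicted_class = max(range(len(ensemble_probs)), key=ensemble_probs.__getitem__)
```

In this toy setup the shallow exit favours class 0 while the deeper exits favour class 1, and the weights decide how much each opinion counts. A training procedure built on such an ensemble would typically compute its loss on the combined output, so that gradients reach every exit.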

