LG · CL · CV · Feb 2, 2025

BEEM: Boosting Performance of Early Exit DNNs using Multi-Exit Classifiers as Experts

arXiv:2502.00745v1 · 6 citations · h-index: 16 · Has Code · ICLR
AI Analysis

This work addresses inference efficiency for deep learning applications, offering an incremental improvement over existing early exit methods.

The paper tackles inference latency in deep neural networks that use early exit techniques. It proposes a decision criterion that treats the exit classifiers as experts and aggregates their confidence scores, achieving speed-ups of 1.5x to 2.1x on the COCO and GLUE datasets while maintaining comparable or improved accuracy.

Early Exit (EE) techniques have emerged as a means to reduce inference latency in Deep Neural Networks (DNNs). The latency improvement and accuracy of these techniques depend crucially on the criteria used to make exit decisions. We propose a new decision criterion, BEEM, in which exit classifiers are treated as experts and their confidence scores are aggregated. The confidence scores are aggregated only if neighbouring experts are consistent in their predictions as the samples pass through them, thus capturing their ensemble effect. A sample exits when the aggregated confidence value exceeds a threshold. The threshold is set using the error rates of the intermediate exits, with the aim of surpassing the performance of conventional DNN inference. Experimental results on the COCO dataset for image captioning and on the GLUE datasets for various language tasks demonstrate that our method enhances the performance of state-of-the-art EE methods, improving speed-up by a factor of 1.5x to 2.1x. Compared with the final layer, its accuracy is comparable on the harder image captioning task and improves on the easier language tasks. The source code for this work is publicly available at https://github.com/Div290/BEEM1/tree/main
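
The exit rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `beem_style_exit`, the unweighted running sum, and the reset of the score when neighbouring exits disagree are all assumptions made for illustration. The abstract only says that scores are aggregated when neighbouring experts agree and that a sample exits once the aggregate exceeds a threshold. The threshold here is a plain parameter, whereas the paper sets it from the error rates of the intermediate exits.

```python
from typing import List, Sequence, Tuple


def beem_style_exit(exit_probs: Sequence[Sequence[float]],
                    threshold: float) -> Tuple[int, int]:
    """Return (exit_index, predicted_label) for one sample.

    exit_probs[i] is the class-probability vector produced by the i-th
    exit classifier. Hypothetical aggregation rule (an assumption, not
    taken from the paper): keep a running sum of confidences while
    consecutive exits predict the same label, and restart the sum when
    the prediction changes.
    """
    if not exit_probs:
        raise ValueError("at least one exit is required")

    score = 0.0
    prev_label = None
    label = -1
    for i, probs in enumerate(exit_probs):
        # Confidence of this exit: its highest class probability.
        label = max(range(len(probs)), key=lambda k: probs[k])
        conf = probs[label]

        if label == prev_label:
            score += conf   # neighbouring experts agree: aggregate
        else:
            score = conf    # disagreement: start over (assumption)
        prev_label = label

        if score >= threshold:
            return i, label  # confident enough, exit early

    # No exit reached the threshold: fall back to the final classifier.
    return len(exit_probs) - 1, label


# Two consecutive exits agree on class 0, so their confidences add up.
exit_at, pred = beem_style_exit([[0.6, 0.4], [0.7, 0.3], [0.9, 0.1]],
                                threshold=1.2)
print(exit_at, pred)
```

In this sketch, a single moderately confident exit does not trigger an exit on its own, but two agreeing neighbours can. That is one way to read the "ensemble effect" mentioned above.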
