LG · ML · May 28, 2018

Dynamically Sacrificing Accuracy for Reduced Computation: Cascaded Inference Based on Softmax Confidence

arXiv:1805.10982v2 · 9 citations
Originality: Incremental advance
AI Analysis

This paper addresses the high computational cost of deep learning inference, which matters to users who need efficient prediction. The advance is incremental because it builds on existing cascade and confidence-based methods.

The paper tackles the tradeoff between computational effort and classification accuracy in deep neural networks. It introduces a cascaded inference method in which each intermediate classifier's softmax confidence is compared against a threshold, and inference terminates early once that threshold is met. The reported result is a 15%-50% reduction in multiply-accumulate (MAC) operations with only about 1% accuracy degradation.
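The early-termination idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Stage` type, `make_stage` toy stages, and the threshold value are hypothetical, and the confidence is assumed to be the largest softmax probability of each intermediate classifier.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical stage type: maps running features to
# (new features, logits of the classifier attached to this stage).
Stage = Callable[[List[float]], Tuple[List[float], List[float]]]


def cascade_predict(
    x: List[float], stages: Sequence[Stage], thresholds: Sequence[float]
) -> Tuple[int, int]:
    """Return (predicted class, index of the stage that produced it).

    thresholds holds one value per intermediate classifier; the last
    stage always answers, so it needs no threshold.
    """
    features = x
    for i, stage in enumerate(stages):
        features, logits = stage(features)
        probs = softmax(logits)
        confidence = max(probs)  # assumed confidence measure: top softmax value
        label = probs.index(confidence)
        is_last = i == len(stages) - 1
        if is_last or confidence >= thresholds[i]:
            return label, i  # later stages are never computed
    raise ValueError("stages must not be empty")


def make_stage(scale: float) -> Stage:
    """Toy stand-in for a network block plus classifier (illustration only)."""
    def stage(features: List[float]) -> Tuple[List[float], List[float]]:
        return features, [scale * f for f in features]
    return stage


stages = [make_stage(1.0), make_stage(5.0)]
thresholds = [0.9]

easy = cascade_predict([4.0, 0.0, 0.0], stages, thresholds)  # exits at stage 0
hard = cascade_predict([0.5, 0.4, 0.0], stages, thresholds)  # falls through to stage 1
```

In this toy setup the first input is confident enough to stop at the first classifier, while the second input is passed on to the final stage. The MAC savings come from the stages that are skipped for inputs like the first one.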

We study the tradeoff between computational effort and classification accuracy in a cascade of deep neural networks. During inference, the user sets the acceptable accuracy degradation which then automatically determines confidence thresholds for the intermediate classifiers. As soon as the confidence threshold is met, inference terminates immediately without having to compute the output of the complete network. Confidence levels are derived directly from the softmax outputs of intermediate classifiers, as we do not train special decision functions. We show that using a softmax output as a confidence measure in a cascade of deep neural networks leads to a reduction of 15%-50% in the number of MAC operations while degrading the classification accuracy by roughly 1%. Our method can be easily incorporated into pre-trained non-cascaded architectures, as we exemplify on ResNet. Our main contribution is a method that dynamically adjusts the tradeoff between accuracy and computation without retraining the model.
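The abstract states that the acceptable accuracy degradation automatically determines the confidence thresholds, but it does not spell out the procedure. One plausible reading, sketched below for a single intermediate classifier, is to search held-out validation data for the lowest threshold that keeps accuracy within the allowed drop. The function `pick_threshold`, its arguments, and the sample data are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def pick_threshold(
    early_conf: Sequence[float],
    early_correct: Sequence[int],
    final_correct: Sequence[int],
    max_drop: float,
) -> float:
    """Lowest threshold whose validation accuracy stays within max_drop
    of the full network's accuracy (assumed selection rule).

    early_conf[k]    : top softmax value of the intermediate classifier on sample k
    early_correct[k] : 1 if the intermediate classifier is right on sample k, else 0
    final_correct[k] : 1 if the full network is right on sample k, else 0
    """
    n = len(early_conf)
    baseline = sum(final_correct) / n
    # Try every observed confidence as a candidate, lowest first, so the
    # first acceptable candidate lets the most samples exit early.
    for t in sorted(set(early_conf)):
        hits = sum(
            e if c >= t else f
            for c, e, f in zip(early_conf, early_correct, final_correct)
        )
        if hits / n >= baseline - max_drop:
            return t
    return float("inf")  # no candidate qualifies: never exit early


# Made-up validation statistics for illustration only.
conf = [0.99, 0.95, 0.80, 0.60, 0.55]
early = [1, 1, 1, 0, 0]
final = [1, 1, 1, 1, 0]

strict = pick_threshold(conf, early, final, max_drop=0.0)
loose = pick_threshold(conf, early, final, max_drop=0.25)
```

With no allowed drop the sketch settles on a threshold of 0.80, so only the three most confident samples exit early. Allowing a larger drop lowers the threshold to 0.55, and every sample exits at the intermediate classifier. Because the thresholds are chosen after training, changing the allowed drop does not require retraining the model, which matches the claim in the abstract.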

Code Implementations: 1 repo
