Confidence-gated training for efficient early-exit neural networks
CGT offers a practical solution for deploying deep models in resource-constrained environments, though the contribution is incremental because it builds on existing early-exit methods.
The paper addresses gradient interference in jointly trained early-exit neural networks, where deeper classifiers dominate optimization and undermine efficiency. It proposes Confidence-Gated Training (CGT), which propagates gradients from deeper exits only when preceding exits fail, and reports lower average inference cost together with improved overall accuracy on the Indian Pines and Fashion-MNIST benchmarks.
Early-exit neural networks reduce inference cost by enabling confident predictions at intermediate layers. However, joint training often leads to gradient interference, with deeper classifiers dominating optimization. We propose Confidence-Gated Training (CGT), a paradigm that conditionally propagates gradients from deeper exits only when preceding exits fail. This encourages shallow classifiers to act as primary decision points while reserving deeper layers for harder inputs. By aligning training with the inference-time policy, CGT mitigates overthinking, improves early-exit accuracy, and preserves efficiency. Experiments on the Indian Pines and Fashion-MNIST benchmarks show that CGT lowers average inference cost while improving overall accuracy, offering a practical solution for deploying deep models in resource-constrained environments.
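The gating idea can be illustrated with a minimal sketch. The abstract does not define when an exit "fails", so the code below makes an assumption: an exit succeeds only if its prediction is correct and its top softmax probability reaches a threshold `tau`. The function names (`exit_gates`, `gated_loss`, `early_exit_predict`), the hard 0/1 gate, and the default `tau=0.9` are all hypothetical choices for illustration, not the paper's actual formulation.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    return max(range(len(values)), key=values.__getitem__)

def exit_gates(exit_logits, label, tau=0.9):
    """One 0/1 gate per exit for a single sample (shallowest exit first).

    Assumption: an exit "succeeds" when it is both correct and confident.
    An exit's loss is switched on only if every earlier exit failed.
    """
    gates = []
    earlier_succeeded = False
    for logits in exit_logits:
        gates.append(0.0 if earlier_succeeded else 1.0)
        probs = softmax(logits)
        pred = argmax(probs)
        if pred == label and probs[pred] >= tau:
            earlier_succeeded = True
    return gates

def gated_loss(exit_logits, label, tau=0.9):
    # Sum of per-exit cross-entropy terms, each scaled by its gate.
    gates = exit_gates(exit_logits, label, tau)
    losses = [-math.log(softmax(logits)[label]) for logits in exit_logits]
    return sum(g * l for g, l in zip(gates, losses))

def early_exit_predict(exit_logits, tau=0.9):
    # Inference-time policy: stop at the first confident exit,
    # falling back to the deepest exit if none is confident.
    last = len(exit_logits) - 1
    for k, logits in enumerate(exit_logits):
        probs = softmax(logits)
        pred = argmax(probs)
        if probs[pred] >= tau or k == last:
            return pred, k

# Two exits, three classes. The easy sample is resolved by the first exit,
# so the deeper exit's loss is gated off; the hard sample reaches both exits.
easy = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0]]
hard = [[0.2, 0.1, 0.0], [0.0, 5.0, 0.0]]
print(exit_gates(easy, label=0))
print(exit_gates(hard, label=1))
print(early_exit_predict(hard))
```

In an autodiff framework the gate would presumably be treated as a constant, so that gradients flow through the gated loss terms but not through the gating decision itself. Using the same threshold for the training gates and for the inference-time policy is one way to read the abstract's claim that training is aligned with inference.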