CV · Sep 10, 2025

Boosted Training of Lightweight Early Exits for Optimizing CNN Image Classification Inference

arXiv:2509.08318v1 · h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses efficiency-accuracy trade-offs for real-time image classification on resource-constrained platforms like embedded systems, offering practical gains for applications such as industrial inspection and UAV monitoring, though it is incremental over existing early-exit methods.

The paper tackles covariance shift in early-exit CNN training for image classification: downstream branches are trained on the full dataset but see only the harder, non-exited samples at inference, which limits the achievable efficiency-accuracy trade-off. The proposed Boosted Training Scheme for Early Exits (BTS-EE), combined with lightweight 1D-convolution branches and Class Precision Margin (CPM) calibration, achieves up to a 45% reduction in computation with only 2% accuracy degradation on CINIC-10 with a ResNet18 backbone.

Real-time image classification on resource-constrained platforms demands inference methods that balance accuracy with strict latency and power budgets. Early-exit strategies address this need by attaching auxiliary classifiers to intermediate layers of convolutional neural networks (CNNs), allowing "easy" samples to terminate inference early. However, conventional training of early exits introduces a covariance shift: downstream branches are trained on full datasets, while at inference they process only the harder, non-exited samples. This mismatch limits efficiency-accuracy trade-offs in practice. We introduce the Boosted Training Scheme for Early Exits (BTS-EE), a sequential training approach that aligns branch training with inference-time data distributions. Each branch is trained and calibrated before the next, ensuring robustness under selective inference conditions. To further support embedded deployment, we propose a lightweight branch architecture based on 1D convolutions and a Class Precision Margin (CPM) calibration method that enables per-class threshold tuning for reliable exit decisions. Experiments on the CINIC-10 dataset with a ResNet18 backbone demonstrate that BTS-EE consistently outperforms non-boosted training across 64 configurations, achieving up to 45 percent reduction in computation with only 2 percent accuracy degradation. These results expand the design space for deploying CNNs in real-time image processing systems, offering practical efficiency gains for applications such as industrial inspection, embedded vision, and UAV-based monitoring.
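
The sequential idea in the abstract (train a branch, calibrate its per-class exit thresholds, then train the next branch only on the samples that did not exit) can be illustrated with a small pure-Python sketch. This is not the authors' implementation: the branch interface, the `train_fns` callables, the choice to filter the calibration set as well, and the precision-target rule standing in for CPM calibration are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]                     # (features, label)
Branch = Callable[[Sequence[float]], Tuple[int, float]]  # -> (predicted class, confidence); hypothetical


def calibrate_thresholds(branch: Branch, data: List[Sample],
                         num_classes: int, target_precision: float) -> List[float]:
    """Per class, find the lowest confidence threshold whose exits meet the precision target.

    Illustrative stand-in for per-class threshold tuning; not the paper's exact CPM rule.
    """
    thresholds = [float("inf")] * num_classes  # inf means "never exit for this class"
    preds = [(branch(x), y) for x, y in data]
    for c in range(num_classes):
        scored = sorted(((conf, y == c) for (pred, conf), y in preds if pred == c),
                        reverse=True)
        correct = 0
        for n, (conf, ok) in enumerate(scored, start=1):
            correct += ok
            if n < len(scored) and scored[n][0] == conf:
                continue  # only evaluate at the end of a group of tied confidences
            if correct / n >= target_precision:
                thresholds[c] = conf
    return thresholds


def exits(branch: Branch, thresholds: List[float], x: Sequence[float]) -> bool:
    pred, conf = branch(x)
    return conf >= thresholds[pred]


def boosted_training(train_fns, train_data, calib_data, num_classes, target_precision):
    """Train and calibrate branches in order; later branches see only non-exited samples."""
    stages = []
    for train_fn in train_fns:
        if not train_data:
            break
        branch = train_fn(train_data)
        thresholds = calibrate_thresholds(branch, calib_data, num_classes, target_precision)
        stages.append((branch, thresholds))
        # Assumption: both sets are filtered so they match what the next branch sees at inference.
        train_data = [(x, y) for x, y in train_data if not exits(branch, thresholds, x)]
        calib_data = [(x, y) for x, y in calib_data if not exits(branch, thresholds, x)]
    return stages


def predict(stages, fallback, x):
    """Return (prediction, exit index); exit index len(stages) means the final classifier ran."""
    for depth, (branch, thresholds) in enumerate(stages):
        pred, conf = branch(x)
        if conf >= thresholds[pred]:
            return pred, depth
    return fallback(x), len(stages)


def train_toy_branch(data):
    """Toy 'training': split 1-D inputs at their mean; confidence is the distance to the split."""
    mean = sum(x[0] for x, _ in data) / len(data)
    return lambda x: (int(x[0] > mean), abs(x[0] - mean))


toy_data = [([-3.0], 0), ([-2.0], 0), ([-0.5], 1), ([0.5], 0), ([2.0], 1), ([3.0], 1)]
stages = boosted_training([train_toy_branch, train_toy_branch], toy_data, toy_data, 2, 1.0)
fallback = lambda x: 0  # stand-in for the backbone's final classifier
print(predict(stages, fallback, [3.0]))  # confident sample: exits at the first branch
print(predict(stages, fallback, [0.5]))  # ambiguous sample: falls through to the fallback
```

In a real system each entry in `train_fns` would fit a lightweight branch on intermediate backbone features, and `fallback` would be the full network's output layer. The toy branches here only show the data flow between stages.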

