13.3LGMay 1
LEAP: Layer-wise Exit-Aware Pretraining for Efficient Transformer InferenceShashank Kapadia, Deep Naryan Mishra, Sujal Reddy Alugubelli et al.
Layer-aligned distillation and convergence-based early exit represent two predominant computational efficiency paradigms for transformer inference; yet we establish that they exhibit systematic incompatibility under standard deployment conditions for convergence-based early exit. Distillation objectives that align intermediate student layers to teacher representations suppress the representational convergence that early-exit mechanisms exploit, rendering such mechanisms ineffective on distilled models. We introduce LEAP (Layer-wise Exit-Aware Pretraining), an auxiliary training objective that reconciles this incompatibility. LEAP requires no architectural modifications; it augments standard distillation with a single constraint ensuring intermediate layers approximate final-layer representations. LEAP-MiniLM achieves 1.61$\times$ measured wall-clock speedup (batch=1, NVIDIA L4) at $θ$=0.95, with 91.9% of samples exiting by layer 7 and 1.80$\times$ theoretical layer reduction, where standard distilled models achieve zero effective speedup. We validate across sentence similarity (STS-B: 0.760 $\pm$ 0.006) and retrieval benchmarks (BEIR), providing operational guidance including latency measurements, decision thresholds, and deployment criteria.
CVDec 23, 2024
Neural Spatial-Temporal Tensor Representation for Infrared Small Target DetectionFengyi Wu, Simin Liu, Haoan Wang et al.
Optimization-based approaches dominate infrared small target detection as they leverage infrared imagery's intrinsic low-rankness and sparsity. While effective for single-frame images, they struggle with dynamic changes in multi-frame scenarios as traditional spatial-temporal representations often fail to adapt. To address these challenges, we introduce a Neural-represented Spatial-Temporal Tensor (NeurSTT) model. This framework employs nonlinear networks to enhance spatial-temporal feature correlations in background approximation, thereby supporting target detection in an unsupervised manner. Specifically, we employ neural layers to approximate sequential backgrounds within a low-rank informed deep scheme. A neural three-dimensional total variation is developed to refine background smoothness while reducing static target-like clusters in sequences. Traditional sparsity constraints are incorporated into the loss functions to preserve potential targets. By replacing complex solvers with a deep updating strategy, NeurSTT simplifies the optimization process in a domain-awareness way. Visual and numerical results across various datasets demonstrate that our method outperforms detection challenges. Notably, it has 16.6$\times$ fewer parameters and averaged 19.19\% higher in $IoU$ compared to the suboptimal method on $256 \times 256$ sequences.