CV LG · Apr 4

Rényi Attention Entropy for Patch Pruning

arXiv:2604.03803 · 1.8 · h-index: 1

AI Analysis

For vision Transformers, this work offers an adaptive pruning criterion that improves the accuracy-computation trade-off, though it is incremental over existing entropy-based methods.

The authors propose a patch pruning method for Transformers using Rényi entropy of attention distributions to identify redundant patches, reducing computation while preserving accuracy on fine-grained image recognition tasks.

Transformers are strong baselines in both vision and language because self-attention captures long-range dependencies across tokens. However, the cost of self-attention grows quadratically with the number of tokens. Patch pruning mitigates this cost by estimating per-patch importance and removing redundant patches. To identify informative patches for pruning, we introduce a criterion based on the Shannon entropy of the attention distribution. Low-entropy patches, which receive selective and concentrated attention, are kept as important, while high-entropy patches with attention spread across many locations are treated as redundant. We also extend the criterion from Shannon to Rényi entropy, which emphasizes sharp attention peaks and supports pruning strategies that adapt to task needs and computational limits. In experiments on fine-grained image recognition, where patch selection is critical, our method reduced computation while preserving accuracy. Moreover, adjusting the pruning policy through the Rényi entropy measure yields further gains and improves the trade-off between accuracy and computation.
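The entropy criterion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `renyi_entropy` and `keep_indices`, the `keep_ratio` parameter, and the assumption that each patch comes with one attention distribution summing to 1 are all hypothetical choices made for illustration. The sketch uses the standard Rényi entropy of order alpha, H_alpha(p) = log(sum_i p_i^alpha) / (1 - alpha), which reduces to Shannon entropy as alpha approaches 1; for alpha > 1 it weights large probabilities more heavily, which is one way to read "emphasizes sharp attention peaks".

```python
import math


def renyi_entropy(p, alpha):
    """Renyi entropy (natural log) of a probability vector p.

    alpha == 1 is handled as the Shannon limit. Zero entries are skipped.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if abs(alpha - 1.0) < 1e-9:
        # Shannon entropy: -sum p_i log p_i
        return -sum(x * math.log(x) for x in p if x > 0)
    return math.log(sum(x ** alpha for x in p if x > 0)) / (1.0 - alpha)


def keep_indices(attn_rows, keep_ratio, alpha=2.0):
    """Return indices of the patches to keep, in ascending order.

    Assumption: attn_rows[i] is the attention distribution associated
    with patch i. Patches with the lowest entropy (most concentrated
    attention) are kept; the rest are treated as redundant.
    """
    scores = [renyi_entropy(row, alpha) for row in attn_rows]
    k = max(1, round(keep_ratio * len(attn_rows)))
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[:k])


# Toy example with four patches (made-up numbers, not from the paper).
attn = [
    [0.97, 0.01, 0.01, 0.01],  # sharply peaked
    [0.25, 0.25, 0.25, 0.25],  # uniform, maximal entropy
    [0.70, 0.10, 0.10, 0.10],  # moderately peaked
    [0.40, 0.30, 0.20, 0.10],  # diffuse
]
kept = keep_indices(attn, keep_ratio=0.5, alpha=2.0)
print(kept)  # -> [0, 2]
```

In this sketch, `keep_ratio` stands in for the computational budget and `alpha` for the task-dependent knob; how the paper actually sets either is not stated in the abstract.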

