CV · Jun 11, 2025

Attention, Please! Revisiting Attentive Probing Through the Lens of Efficiency

Bill Psomas, Dionysis Christopoulos, Eirini Baltzi, Ioannis Kakogeorgiou, Tilemachos Aravanis, Nikos Komodakis, Konstantinos Karantzalos, Yannis Avrithis, Giorgos Tolias

arXiv:2506.10178v2 · 13.17 citations · h-index: 43 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the computational inefficiency in probing protocols for researchers evaluating large-scale pre-trained models, offering an incremental improvement over existing attentive methods.

The paper tackles the inefficiency of existing attentive probing methods for evaluating pre-trained models by proposing efficient probing (EP), a multi-query cross-attention mechanism that reduces the number of trainable parameters. EP outperforms linear probing and prior attentive probing methods across seven benchmarks, with strong low-shot and layer-wise gains.

As fine-tuning becomes increasingly impractical at scale, probing is emerging as the preferred evaluation protocol. Yet, the standard linear probing fails to adequately reflect the potential of models whose pre-training optimizes representations of patch tokens rather than an explicit global representation. This motivates the need for attentive probing, an alternative that uses attention to selectively aggregate patch-level features. Despite its growing adoption, attentive probing remains under-explored, with existing methods suffering from excessive parameterization and poor computational efficiency. In this work, we revisit attentive probing through the lens of the accuracy vs. parameter efficiency trade-off. We present the first comprehensive study of existing methods, analyzing their design choices and benchmarking their performance. Building on this, we propose efficient probing (EP), a simple yet effective multi-query cross-attention mechanism that eliminates redundant projections and reduces the number of trainable parameters. Despite its simplicity, EP outperforms linear probing and prior attentive probing approaches across seven benchmarks, generalizes well to diverse pre-training paradigms, and delivers strong low-shot and layer-wise gains. Beyond evaluation, our analysis uncovers emerging properties of EP, such as complementary attention maps, which open new directions for leveraging probing beyond protocol design. Code available at https://github.com/billpsomas/efficient-probing.
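To make the idea of a multi-query cross-attention probe more concrete, here is a minimal, dependency-free sketch. It is one possible reading of the abstract, not the code from the linked repository. It assumes that each of M learnable queries scores the frozen patch tokens directly (no key projection) and pools only its own D/M slice of the features (no value or output projection). The names `multi_query_pool` and `trainable_params` are hypothetical, and the parameter count assumes a plain linear classifier on top of the pooled descriptor.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_query_pool(patches, queries):
    """Pool N patch tokens of dimension D with M queries (hypothetical helper).

    Query j scores every patch with a scaled dot product, then takes the
    attention-weighted average of only its own D/M slice of the features.
    The M slices are concatenated into a single D-dimensional descriptor.
    """
    dim = len(patches[0])
    num_queries = len(queries)
    assert dim % num_queries == 0, "D must be divisible by the number of queries"
    chunk = dim // num_queries

    pooled, attn_maps = [], []
    for j, q in enumerate(queries):
        scores = [
            sum(qi * xi for qi, xi in zip(q, x)) / math.sqrt(dim)
            for x in patches
        ]
        attn = softmax(scores)  # one attention map per query
        attn_maps.append(attn)
        start = j * chunk
        for d in range(start, start + chunk):
            pooled.append(sum(a * x[d] for a, x in zip(attn, patches)))
    return pooled, attn_maps


def trainable_params(dim, num_queries, num_classes):
    """Count trainable parameters under the assumptions stated above."""
    pooling = num_queries * dim                   # the learnable queries only
    classifier = dim * num_classes + num_classes  # linear layer with bias
    return pooling + classifier


random.seed(0)
N, D, M = 16, 8, 4  # toy sizes: 16 patches, 8-dim features, 4 queries
patches = [[random.gauss(0, 1) for _ in range(D)] for _ in range(N)]
queries = [[random.gauss(0, 1) for _ in range(D)] for _ in range(M)]
pooled, attn_maps = multi_query_pool(patches, queries)
```

Under these assumptions the pooling stage trains only M·D parameters, whereas a generic cross-attention block with separate query, key, value, and output projections would add on the order of 4·D² weights. Because each query produces its own attention map, the maps in `attn_maps` can be inspected individually.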
