Eff-GRot: Efficient and Generalizable Rotation Estimation with Transformers
This work targets rotation estimation for latency-sensitive applications, but it appears incremental because it relies on existing transformer-based methods to achieve its efficiency.
The paper tackles efficient and generalizable rotation estimation from RGB images. It introduces Eff-GRot, which uses a transformer to predict an object's rotation in a single forward pass without object-specific training, balancing accuracy against computational efficiency.
We introduce Eff-GRot, an approach for efficient and generalizable rotation estimation from RGB images. Given a query image and a set of reference images with known orientations, our method directly predicts the object's rotation in a single forward pass, without requiring object- or category-specific training. At the core of our framework is a transformer that performs a comparison in the latent space, jointly processing rotation-aware representations from multiple references alongside a query. This design enables a favorable balance between accuracy and computational efficiency while remaining simple, scalable, and fully end-to-end. Experimental results show that Eff-GRot offers a promising direction toward more efficient rotation estimation, particularly in latency-sensitive applications.
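To make the query-versus-references interface concrete, here is a toy sketch of a latent-space comparison. It is not the Eff-GRot architecture: it uses a single softmax attention step with no learned parameters, and it returns the rotation of the best-matching reference rather than regressing a rotation as the transformer described above does. The function name `attend_to_references`, the `temperature` parameter, the 2-D embeddings, and the use of yaw angles in degrees are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_to_references(query_feat, ref_feats, ref_rotations, temperature=0.1):
    """Toy comparison: one attention step of the query over the references.

    query_feat    -- embedding of the query image (hypothetical, list of floats)
    ref_feats     -- embeddings of the reference images, one list per reference
    ref_rotations -- known orientation of each reference (here: yaw in degrees)

    Returns (attention weights, rotation of the best-matching reference).
    """
    # Dot-product similarity between the query and each reference embedding,
    # sharpened by a temperature (an illustrative hyperparameter).
    scores = [
        sum(q * r for q, r in zip(query_feat, ref)) / temperature
        for ref in ref_feats
    ]
    weights = softmax(scores)
    # Pick the reference that receives the most attention.
    best = max(range(len(weights)), key=weights.__getitem__)
    return weights, ref_rotations[best]


# Three references with known yaw angles and made-up 2-D embeddings.
ref_feats = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
ref_rotations = [0.0, 90.0, 180.0]
query_feat = [0.1, 0.9]  # closest to the second reference

weights, predicted = attend_to_references(query_feat, ref_feats, ref_rotations)
print(predicted)
```

Everything here runs in one pass over the references, which mirrors the single-forward-pass property claimed above. A learned model would replace the fixed dot product with trained attention layers and output a rotation directly.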