We introduce Eff-GRot, an approach for efficient and generalizable rotation estimation from RGB images. Given a query image and a set of reference images with known orientations, our method directly predicts the object's rotation in a single forward pass, without requiring object- or category-specific training. At the core of our framework is a transformer that compares the query with the references in latent space by jointly processing the query representation and the rotation-aware representations of multiple references. This design strikes a favorable balance between accuracy and computational efficiency while remaining simple, scalable, and fully end-to-end. Experimental results show that Eff-GRot is a promising direction toward more efficient rotation estimation, particularly in latency-sensitive applications.
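To make the reference-based setup concrete, the following is a minimal toy sketch and not the Eff-GRot architecture. It swaps the learned transformer for a single untrained attention step: the query attends over the reference features, and the references' known rotations are blended with the resulting weights and projected back onto a valid rotation matrix. Everything here is an illustrative assumption, including the names `toy_feature` and `estimate_rotation`, the 2D stand-in features, the 6D rotation code, and the temperature value.

```python
import math

def rot_z(deg):
    """Rotation matrix for a rotation of `deg` degrees about the z-axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def toy_feature(deg):
    """Hypothetical stand-in for an image encoder (assumption, not a real model)."""
    return [math.cos(math.radians(deg)), math.sin(math.radians(deg))]

def to_6d(R):
    """Encode a rotation by its first two columns (a 6D code)."""
    return [R[0][0], R[1][0], R[2][0], R[0][1], R[1][1], R[2][1]]

def from_6d(v):
    """Gram-Schmidt: map a non-degenerate 6D vector to a rotation matrix."""
    a, b = v[:3], v[3:]
    norm_a = math.sqrt(sum(t * t for t in a))
    x = [t / norm_a for t in a]
    proj = sum(p * q for p, q in zip(x, b))
    b = [q - proj * p for p, q in zip(x, b)]
    norm_b = math.sqrt(sum(t * t for t in b))
    y = [t / norm_b for t in b]
    # Third column is the cross product, which makes the determinant +1.
    z = [x[1] * y[2] - x[2] * y[1],
         x[2] * y[0] - x[0] * y[2],
         x[0] * y[1] - x[1] * y[0]]
    return [[x[i], y[i], z[i]] for i in range(3)]

def estimate_rotation(query_feat, ref_feats, ref_rots, temperature=0.1):
    """One attention step: softmax over query-reference similarity,
    then a weighted blend of the references' 6D rotation codes."""
    scores = [sum(q * f for q, f in zip(query_feat, feat)) / temperature
              for feat in ref_feats]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    codes = [to_6d(R) for R in ref_rots]
    blended = [sum(w * c[k] for w, c in zip(weights, codes)) for k in range(6)]
    return from_6d(blended), weights

# Toy usage: eight references spaced 45 degrees apart, query viewed at 40 degrees.
ref_angles = [0, 45, 90, 135, 180, 225, 270, 315]
R_pred, weights = estimate_rotation(
    toy_feature(40),
    [toy_feature(a) for a in ref_angles],
    [rot_z(a) for a in ref_angles],
)
```

In this toy, accuracy is bounded by how densely the references cover the rotation space, since the output is only a blend of known orientations. A learned model that processes query and reference tokens jointly, as the abstract describes, is not restricted to such a blend.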