The recently proposed DEtection TRansformer (DETR) has established a fully end-to-end paradigm for object detection. However, DETR suffers from slow training convergence, which hinders its applicability to various detection tasks. We observe that DETR's slow convergence is largely attributable to the difficulty of matching object queries to relevant regions, which stems from the unaligned semantics between object queries and encoded image features. Based on this observation, we design Semantic-Aligned-Matching DETR++ (SAM-DETR++) to accelerate DETR's convergence and improve detection performance. The core of SAM-DETR++ is a plug-and-play module that projects object queries and encoded image features into the same feature embedding space, where each object query can be easily matched to relevant regions with similar semantics. In addition, SAM-DETR++ searches for multiple representative keypoints and exploits their features for semantic-aligned matching with enhanced representation capacity. Furthermore, SAM-DETR++ can effectively fuse multi-scale features in a coarse-to-fine manner on the basis of the designed semantic-aligned matching. Extensive experiments show that the proposed SAM-DETR++ achieves superior convergence speed and competitive detection accuracy. Additionally, as a plug-and-play method, SAM-DETR++ can complement existing DETR convergence solutions with even better performance, achieving 44.8% AP with merely 12 training epochs and 49.1% AP with 50 training epochs on COCO val2017 with ResNet-50. Code is available at https://github.com/ZhangGongjie/SAM-DETR.
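The matching idea can be illustrated with a minimal, dependency-free sketch. It is not the released implementation: the function names (`bilinear_sample`, `resample_query`, `attention_weights`) are hypothetical, the keypoints are passed in by hand rather than predicted by a learned module, and the sampled features are simply averaged, which is a simplification chosen for brevity. The point it shows is that a query rebuilt from image features lives in the same space as the keys, so a plain dot product already favors semantically similar locations.

```python
import math


def bilinear_sample(feat, x, y):
    """Sample feat (H x W x C nested lists) at normalized (x, y) in [0, 1]."""
    h, w = len(feat), len(feat[0])
    px, py = x * (w - 1), y * (h - 1)
    x0, y0 = int(math.floor(px)), int(math.floor(py))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = px - x0, py - y0
    return [
        (1 - ay) * ((1 - ax) * feat[y0][x0][c] + ax * feat[y0][x1][c])
        + ay * ((1 - ax) * feat[y1][x0][c] + ax * feat[y1][x1][c])
        for c in range(len(feat[0][0]))
    ]


def resample_query(feat, keypoints):
    """Rebuild a query embedding from image features at the given keypoints.

    Assumption: features are averaged; keypoints are supplied by the caller.
    """
    samples = [bilinear_sample(feat, x, y) for x, y in keypoints]
    return [sum(col) / len(samples) for col in zip(*samples)]


def attention_weights(query, feat):
    """Softmax over scaled dot products between the query and every location."""
    keys = [vec for row in feat for vec in row]
    scale = math.sqrt(len(query))
    logits = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 2 x 2 feature map with 2 channels; the top-left cell is the "object".
feat = [[[4.0, 0.0], [0.0, 4.0]],
        [[0.0, 4.0], [0.0, 4.0]]]
query = resample_query(feat, [(0.0, 0.0)])
weights = attention_weights(query, feat)
```

In this toy case the query is sampled from the top-left cell, so the attention weights concentrate on that same cell; no learned projection is needed to align the two sides.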