Fast Convergence of DETR with Spatially Modulated Co-Attention

August 5, 2021

Peng Gao, Minghang Zheng, Xiaogang Wang, Jifeng Dai, Hongsheng Li

From arXiv; accepted at ICCV 2021.

The recently proposed Detection Transformer (DETR) model successfully applies the Transformer to object detection and achieves performance comparable to two-stage object detection frameworks such as Faster R-CNN. However, DETR suffers from slow convergence: training DETR from scratch needs 500 epochs to achieve high accuracy. To accelerate its convergence, we propose a simple yet effective scheme for improving the DETR framework, namely the Spatially Modulated Co-Attention (SMCA) mechanism. The core idea of SMCA is to conduct location-aware co-attention in DETR by constraining co-attention responses to be high near initially estimated bounding box locations. Our proposed SMCA increases DETR's convergence speed by replacing the original co-attention mechanism in the decoder while keeping other operations in DETR unchanged. Furthermore, by integrating multi-head and scale-selection attention designs into SMCA, our fully-fledged SMCA can achieve better performance compared to DETR with a dilated convolution-based backbone (45.6 mAP at 108 epochs vs. 43.3 mAP at 500 epochs). We perform extensive ablation studies on the COCO dataset to validate SMCA. Code is released at https://github.com/gaopengcuhk/SMCA-DETR.
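
To make the core idea concrete, the following is a minimal sketch of location-aware co-attention, not the authors' implementation (see the repository linked above for that). It assumes the spatial constraint takes the form of a Gaussian-like weight map centered on a query's initially estimated box location, whose logarithm is added to the usual dot-product attention logits before the softmax. The function names, the `beta` bandwidth parameter, and the toy inputs are all hypothetical and chosen only for illustration.

```python
import math


def gaussian_log_prior(height, width, center, scale, beta=1.0):
    """Log of a Gaussian-like weight map over an H x W feature grid.

    center = (cy, cx) and scale = (sy, sx) stand for a query's initial
    estimate of the object's location and extent, in grid units (assumed
    inputs). The map peaks at the center (log-weight 0) and decays with
    distance from it.
    """
    cy, cx = center
    sy, sx = scale
    return [
        -((y - cy) ** 2) / (beta * sy ** 2) - ((x - cx) ** 2) / (beta * sx ** 2)
        for y in range(height)
        for x in range(width)
    ]


def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def modulated_co_attention(query, keys, values, log_prior):
    """Single-head co-attention for one query, with a spatial prior added to the logits."""
    d = len(query)
    logits = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d) + lp
        for key, lp in zip(keys, log_prior)
    ]
    weights = softmax(logits)
    out = [
        sum(w * v[c] for w, v in zip(weights, values))
        for c in range(len(values[0]))
    ]
    return out, weights


# Toy example: a 4 x 4 feature map with identical keys, so content-based
# attention alone would be uniform and only the spatial prior shapes it.
H, W = 4, 4
keys = [[1.0, 0.0] for _ in range(H * W)]
values = [[float(i), 1.0] for i in range(H * W)]
query = [0.5, 0.5]

log_prior = gaussian_log_prior(H, W, center=(1, 2), scale=(1.0, 1.0))
out, weights = modulated_co_attention(query, keys, values, log_prior)
peak = max(range(H * W), key=lambda i: weights[i])
print(divmod(peak, W))  # (1, 2): attention peaks at the estimated center
```

Adding the prior in log-space is equivalent to multiplying the unnormalized attention weights by the Gaussian-like map, so responses far from the estimated box are suppressed while the rest of the attention computation is left unchanged. The multi-head and scale-selection designs mentioned in the abstract are not covered by this sketch.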
