PETR: Position Embedding Transformation for Multi-View 3D Object Detection - Zhuanzhi Papers

Position embedding · Performer · 3D · Object detection · Transformation

July 19, 2022

PETR: Position Embedding Transformation for Multi-View 3D Object Detection

Yingfei Liu, Tiancai Wang, Xiangyu Zhang, Jian Sun

From arXiv. Accepted by ECCV 2022. Code is available at https://github.com/megvii-research/PETR

In this paper, we develop position embedding transformation (PETR) for multi-view 3D object detection. PETR encodes the position information of 3D coordinates into image features, producing 3D position-aware features. Object queries can perceive the 3D position-aware features and perform end-to-end object detection. PETR achieves state-of-the-art performance (50.4% NDS and 44.1% mAP) on the standard nuScenes dataset and ranks 1st on the benchmark. It can serve as a simple yet strong baseline for future research. Code is available at https://github.com/megvii-research/PETR.
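The abstract only says that 3D coordinate information is encoded into the image features. As a rough illustration of what that can look like, here is a minimal NumPy sketch. It is not the authors' implementation: the frustum sampling over candidate depths, the two-layer MLP, the additive fusion, and every name in it (`frustum_points`, `position_aware_features`, `img_to_3d`, `lo`, `hi`, `w1`, `w2`) are assumptions made for this example, and the weights are random rather than learned.

```python
import numpy as np

def frustum_points(h, w, depths):
    """Homogeneous image-space points (u*d, v*d, d, 1), shape (h, w, D, 4)."""
    # Pixel centres (u, v) paired with every candidate depth d.
    v, u, d = np.meshgrid(np.arange(h) + 0.5, np.arange(w) + 0.5, depths,
                          indexing="ij")
    return np.stack([u * d, v * d, d, np.ones_like(d)], axis=-1)

def position_aware_features(feats, img_to_3d, depths, lo, hi, w1, w2):
    """Add an embedding of per-pixel 3D coordinates to multi-view features.

    feats:     (n, h, w, c) image features for n camera views
    img_to_3d: (n, 4, 4) hypothetical matrices mapping image-space frustum
               points of each view into one shared 3D space
    lo, hi:    (3,) bounds used to normalise the 3D coordinates
    w1, w2:    weights of an assumed two-layer MLP position encoder
    """
    n, h, w, c = feats.shape
    pts = frustum_points(h, w, depths)                          # (h, w, D, 4)
    # Transform every frustum point of every view into the shared 3D space.
    xyz = np.einsum("nij,hwdj->nhwdi", img_to_3d, pts)[..., :3]
    xyz = (xyz - lo) / (hi - lo)                                # normalise by range
    flat = xyz.reshape(n, h, w, -1)                             # (n, h, w, D*3)
    pos = np.maximum(flat @ w1, 0.0) @ w2                       # ReLU MLP -> (n, h, w, c)
    return feats + pos                                          # position-aware features

# Toy usage with random features and identity transforms.
rng = np.random.default_rng(0)
n, h, w, c, D = 2, 4, 5, 8, 3
feats = rng.normal(size=(n, h, w, c))
depths = np.linspace(1.0, 10.0, D)
img_to_3d = np.stack([np.eye(4)] * n)
lo, hi = np.array([0.0, 0.0, 0.0]), np.array([50.0, 50.0, 10.0])
w1, w2 = rng.normal(size=(D * 3, 16)), rng.normal(size=(16, c))
out = position_aware_features(feats, img_to_3d, depths, lo, hi, w1, w2)
print(out.shape)  # (2, 4, 5, 8)
```

In a detector of this kind, the resulting features would then be attended to by the object queries; that decoder step is left out of the sketch.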
