可变换的ConvNets v2:更变换、更好的结果 (Deformable ConvNets v2: More Deformable, Better Results) - 专知论文

会员服务 ·

0

ConvNets · Performer · Better · Networking · MoDELS ·

2018 年 11 月 27 日

Deformable ConvNets v2: More Deformable, Better Results

翻译：可变换的ConvNets v2:更变换、更好的结果

Xizhou Zhu,Han Hu,Stephen Lin,Jifeng Dai

The superior performance of Deformable Convolutional Networks arises from its ability to adapt to the geometric variations of objects. Through an examination of its adaptive behavior, we observe that while the spatial support for its neural features conforms more closely than regular ConvNets to object structure, this support may nevertheless extend well beyond the region of interest, causing features to be influenced by irrelevant image content. To address this problem, we present a reformulation of Deformable ConvNets that improves its ability to focus on pertinent image regions, through increased modeling power and stronger training. The modeling power is enhanced through a more comprehensive integration of deformable convolution within the network, and by introducing a modulation mechanism that expands the scope of deformation modeling. To effectively harness this enriched modeling capability, we guide network training via a proposed feature mimicking scheme that helps the network to learn features that reflect the object focus and classification power of R-CNN features. With the proposed contributions, this new version of Deformable ConvNets yields significant performance gains over the original model and produces leading results on the COCO benchmark for object detection and instance segmentation.

翻译：变形变异网络的优异性能来自其适应物体的几何变化的能力。通过对其适应行为的审查,我们观察到,虽然对其神经功能的空间支持比常规的ConvNet更接近于目标结构,但这种支持可能远远超出感兴趣的区域,造成不相干图像内容的影响。为解决这一问题,我们提出了变形变异的ConvNet的重新组合,通过增加建模力和强化培训,提高了其关注相关图像区域的能力。通过在网络内更全面地整合变形变异性变异功能,并通过引入一个调制机制,扩大变形模型的范围,加强了模型的力量。为有效利用这种变异模型能力,我们通过一个拟议的功能模拟方案指导网络培训,帮助网络学习反映R-CNN特性的物体焦点和分类能力。根据拟议的贡献,变形的ConvNet新版本在原始模型上取得了显著的绩效收益,并在COCO物体探测和图像断段基准上产生了领先的成果。

4

相关内容

ConvNets

CVPR 2020 论文开源项目合集

专知会员服务

110+阅读 · 2020年3月12日

【DeepMind】PolyGen: 一种三维网格的自回归生成模型，PolyGen: An Autoregressive Generative Model of 3D Meshes

【DeepMind】PolyGen: 一种三维网格的自回归生成模型，PolyGen: An Autoregressive Generative Model of 3D Meshes

专知会员服务

36+阅读 · 2020年2月27日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

92+阅读 · 2020年2月12日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

Github项目推荐 | 语义分割、实例分割、全景分割和视频分割的论文和基准列表

Github项目推荐 | 语义分割、实例分割、全景分割和视频分割的论文和基准列表

AI研习社

32+阅读 · 2019年4月5日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Faster R-CNN

数据挖掘入门与实战

4+阅读 · 2018年4月20日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

19+阅读 · 2017年12月17日

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

机器学习研究会

20+阅读 · 2017年12月17日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

【论文】【论文】王晓刚老师课题组ICCV2017论文：学习特征金字塔用于人体姿态估计（附代码）

【论文】【论文】王晓刚老师课题组ICCV2017论文：学习特征金字塔用于人体姿态估计（附代码）

机器学习研究会

6+阅读 · 2017年8月5日

Deformable Style Transfer

Deformable Style Transfer

Arxiv

14+阅读 · 2020年3月24日

Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning

Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning

Arxiv

6+阅读 · 2019年5月3日

Selective Kernel Networks

Arxiv

3+阅读 · 2019年3月15日

Learning Discriminative Motion Features Through Detection

Learning Discriminative Motion Features Through Detection

Arxiv

3+阅读 · 2018年12月11日

CapsuleGAN: Generative Adversarial Capsule Network

Arxiv

4+阅读 · 2018年9月25日

MaskReID: A Mask Based Deep Ranking Neural Network for Person Re-identification

Arxiv

8+阅读 · 2018年4月11日

Question Type Guided Attention in Visual Question Answering

Arxiv

5+阅读 · 2018年4月6日

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network

Arxiv

3+阅读 · 2018年1月18日

Temporal 3D ConvNets: New Architecture and Transfer Learning for Video Classification

Arxiv

8+阅读 · 2017年11月22日

SSD: Single Shot MultiBox Detector

Arxiv

5+阅读 · 2016年12月29日

VIP会员

文章信息

相关主题

相关VIP内容

CVPR 2020 论文开源项目合集

专知会员服务

110+阅读 · 2020年3月12日

【DeepMind】PolyGen: 一种三维网格的自回归生成模型，PolyGen: An Autoregressive Generative Model of 3D Meshes

【DeepMind】PolyGen: 一种三维网格的自回归生成模型，PolyGen: An Autoregressive Generative Model of 3D Meshes

专知会员服务

36+阅读 · 2020年2月27日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

92+阅读 · 2020年2月12日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

[ICCV2025]EAMamba：面向图像恢复的高效全能视觉状态空间模型

ICCV 2025 | 超越π0，无界智慧提出A0，首个空间可供性感知的通用操作模型

【博士论文】大规模人工智能中的强化学习智能体：高效训练与更严谨分析

大语言模型推理系统综述

相关资讯

Github项目推荐 | 语义分割、实例分割、全景分割和视频分割的论文和基准列表

Github项目推荐 | 语义分割、实例分割、全景分割和视频分割的论文和基准列表

AI研习社

32+阅读 · 2019年4月5日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Faster R-CNN

数据挖掘入门与实战

4+阅读 · 2018年4月20日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

19+阅读 · 2017年12月17日

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

机器学习研究会

20+阅读 · 2017年12月17日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

【论文】【论文】王晓刚老师课题组ICCV2017论文：学习特征金字塔用于人体姿态估计（附代码）

【论文】【论文】王晓刚老师课题组ICCV2017论文：学习特征金字塔用于人体姿态估计（附代码）

机器学习研究会

6+阅读 · 2017年8月5日

相关论文

Deformable Style Transfer

Deformable Style Transfer

Arxiv

14+阅读 · 2020年3月24日

Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning

Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning

Arxiv

6+阅读 · 2019年5月3日

Selective Kernel Networks

Arxiv

3+阅读 · 2019年3月15日

Learning Discriminative Motion Features Through Detection

Learning Discriminative Motion Features Through Detection

Arxiv

3+阅读 · 2018年12月11日

CapsuleGAN: Generative Adversarial Capsule Network

Arxiv

4+阅读 · 2018年9月25日

MaskReID: A Mask Based Deep Ranking Neural Network for Person Re-identification

Arxiv

8+阅读 · 2018年4月11日

Question Type Guided Attention in Visual Question Answering

Arxiv

5+阅读 · 2018年4月6日

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network

Arxiv

3+阅读 · 2018年1月18日

Temporal 3D ConvNets: New Architecture and Transfer Learning for Video Classification

Arxiv

8+阅读 · 2017年11月22日

SSD: Single Shot MultiBox Detector

Arxiv

5+阅读 · 2016年12月29日

微信扫码咨询专知VIP会员