F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models

September 30, 2022

Weicheng Kuo,Yin Cui,Xiuye Gu,AJ Piergiovanni,Anelia Angelova

From arXiv, 19 pages, 6 figures

We present F-VLM, a simple open-vocabulary object detection method built upon Frozen Vision and Language Models. F-VLM simplifies the current multi-stage training pipeline by eliminating the need for knowledge distillation or detection-tailored pretraining. Surprisingly, we observe that a frozen VLM: 1) retains the locality-sensitive features necessary for detection, and 2) is a strong region classifier. We finetune only the detector head and combine the detector and VLM outputs for each region at inference time. F-VLM shows compelling scaling behavior and achieves a +6.5 mask AP improvement over the previous state of the art on the novel categories of the LVIS open-vocabulary detection benchmark. We also demonstrate very competitive results on the COCO open-vocabulary detection benchmark and on cross-dataset transfer detection, along with significant training speed-up and compute savings. Code will be released.
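The abstract states that the detector and VLM outputs are combined for each region at inference time, but it does not say how. The following is a minimal sketch of one plausible fusion rule, assuming a weighted geometric mean of the two per-category probability vectors. The function names, the weight `alpha`, and the toy logits are all hypothetical and are not taken from the abstract.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_scores(detector_probs, vlm_probs, alpha=0.5):
    """Fuse two per-category probability lists for a single region.

    Assumption: a weighted geometric mean. `alpha` is a hypothetical
    weight in [0, 1]; 0 keeps only the detector score, 1 keeps only
    the frozen-VLM score.
    """
    if len(detector_probs) != len(vlm_probs):
        raise ValueError("score lists must cover the same categories")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [
        d ** (1.0 - alpha) * v ** alpha
        for d, v in zip(detector_probs, vlm_probs)
    ]


# Toy region with made-up logits for three categories.
categories = ["cat", "dog", "zebra"]
detector_probs = softmax([2.0, 1.0, -1.0])  # finetuned detector head
vlm_probs = softmax([0.5, 0.2, 3.0])        # frozen VLM region classifier

fused = combine_scores(detector_probs, vlm_probs, alpha=0.5)
best = categories[max(range(len(fused)), key=fused.__getitem__)]
print(dict(zip(categories, fused)), best)
```

A geometric mean penalizes categories that either model scores near zero, so a region needs some support from both the detector and the VLM to rank highly. Other fusion rules, such as an arithmetic mean, would fit the same interface.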

