Unpaired Referring Expression Grounding via Bidirectional Cross-Modal Matching - 专知 (Zhuanzhi) Papers


Attention · Top-down · Knowledge · Learning · Bottom-up

June 5, 2022

Unpaired Referring Expression Grounding via Bidirectional Cross-Modal Matching


Hengcan Shi, Munawar Hayat, Jianfei Cai

From arXiv, 9 pages, 7 figures

Referring expression grounding is an important and challenging task in computer vision. To avoid the laborious annotation required by conventional referring grounding, unpaired referring grounding has been introduced, where the training data contains only a set of images and a set of queries, with no correspondences between them. The few existing solutions to unpaired referring grounding are still preliminary, owing to the difficulty of learning image-text matching and the lack of top-down guidance with unpaired data. In this paper, we propose a novel bidirectional cross-modal matching (BiCM) framework to address these challenges. In particular, we design a query-aware attention map (QAM) module that introduces a top-down perspective by generating query-specific visual attention maps. A cross-modal object matching (COM) module is further introduced, which exploits the recently introduced pretrained image-text matching model, CLIP, to predict the target objects from a bottom-up perspective. The top-down and bottom-up predictions are then integrated via a similarity fusion (SF) module. We also propose a knowledge adaptation matching (KAM) module that leverages unpaired training data to adapt pretrained knowledge to the target dataset and task. Experiments show that our framework outperforms previous works by 6.55% and 9.94% on two popular grounding datasets.
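To make the top-down and bottom-up combination described in the abstract more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not give the fusion formula, so the weighted sum of softmax-normalized scores, the mean-pooling of attention inside each box, the `alpha` weight, and all function names are assumptions made for illustration. The toy vectors stand in for CLIP text and region embeddings, and the toy array stands in for a query-aware attention map.

```python
import numpy as np


def cosine_similarity(query_vec, object_vecs):
    """Bottom-up score: cosine similarity between one query embedding
    and N candidate-object embeddings (stand-ins for CLIP features)."""
    q = query_vec / np.linalg.norm(query_vec)
    o = object_vecs / np.linalg.norm(object_vecs, axis=1, keepdims=True)
    return o @ q


def attention_box_scores(attention_map, boxes):
    """Top-down score: mean attention inside each box (x0, y0, x1, y1).
    Mean pooling is an assumed choice, not taken from the paper."""
    return np.array([attention_map[y0:y1, x0:x1].mean()
                     for x0, y0, x1, y1 in boxes])


def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()


def fuse_and_select(bottom_up, top_down, alpha=0.5):
    """Hypothetical similarity fusion: weighted sum of the two normalized
    score vectors, then pick the highest-scoring candidate."""
    fused = alpha * softmax(bottom_up) + (1.0 - alpha) * softmax(top_down)
    return int(np.argmax(fused)), fused


# Toy inputs: three candidate boxes on a 4x4 attention map.
query = np.array([1.0, 0.0])
objects = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
attention = np.zeros((4, 4))
attention[0:2, 0:2] = 1.0  # attention concentrated in the top-left corner
boxes = [(0, 0, 2, 2), (2, 2, 4, 4), (0, 0, 4, 4)]

bottom_up = cosine_similarity(query, objects)
top_down = attention_box_scores(attention, boxes)
best, fused = fuse_and_select(bottom_up, top_down)
print("selected box index:", best)
print("fused scores:", fused)
```

In this toy setup both cues favor the first box, so the fused score selects it. When the two cues disagree, `alpha` controls which one dominates.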

