Language-Aware Soft Prompting for Vision & Language Foundation Models - 专知 Papers

October 3, 2022


Adrian Bulat, Georgios Tzimiropoulos

This paper is on soft prompt learning for Vision & Language (V&L) models. Like their NLP counterparts, V&L models can be adapted to a downstream task by learning soft continuous prompts from a few training examples. Current methods learn the soft prompts by minimizing a cross-entropy loss, using as class weights the features obtained by passing the prompts plus the class names through the text encoder. Such methods, however, significantly overfit the training data and suffer large accuracy degradation when tested on unseen classes from the same domain. Our main contribution in this paper is a surprisingly simple approach to alleviate this problem: we use a second cross-entropy loss to minimize the distance between the learned soft prompts and a set of hand-engineered manual prompts (obtained by prompt engineering). The proposed loss can be interpreted in multiple ways, including as a regularizer, as a means for language-based augmentation, and as a way of learning more discriminative class centroids. Importantly, our formulation is inherently amenable to including, during training, virtual classes, i.e., class names for which no visual samples are available, further increasing the robustness of the learned prompts. Through extensive evaluations on 11 datasets, we show that our approach (a) significantly outperforms all prior works on soft prompting, and (b) matches and surpasses, for the first time, the accuracy on novel classes obtained by hand-crafted prompts and CLIP for the majority of the test datasets. Code will be made available.
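The two-loss objective described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration written from the abstract alone, not the authors' implementation: the function names, the temperature of 0.07, the `text_loss_weight` parameter, and the use of precomputed feature arrays in place of a real text and image encoder are all assumptions made for the example. The first loss classifies image features against the learned-prompt class features; the second classifies hand-engineered prompt features against the same learned-prompt class features, which pulls each learned class feature toward its manual counterparts. Because the second loss needs no images, class names without visual samples (virtual classes) can simply be appended as extra rows.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels under softmax(logits), computed stably."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())


def soft_prompt_losses(image_feats, image_labels,
                       learned_text_feats,
                       manual_text_feats, manual_labels,
                       temperature=0.07, text_loss_weight=1.0):
    """Return (vision_loss, text_loss, total_loss).

    image_feats:        (N, d) image-encoder features (assumed precomputed)
    image_labels:       (N,)   class index of each image
    learned_text_feats: (C, d) text features of soft prompt + class name,
                               one row per class, virtual classes included
    manual_text_feats:  (M, d) text features of hand-engineered prompts
    manual_labels:      (M,)   class index of each hand-engineered prompt
    """
    img = l2_normalize(image_feats)
    learned = l2_normalize(learned_text_feats)
    manual = l2_normalize(manual_text_feats)

    # Loss 1: the usual soft-prompting loss, images vs. learned class features.
    vision_loss = cross_entropy(img @ learned.T / temperature, image_labels)

    # Loss 2 (assumed form): hand-engineered prompt features are classified
    # against the learned class features, so each learned feature is pulled
    # toward the manual prompts of its own class.
    text_loss = cross_entropy(manual @ learned.T / temperature, manual_labels)

    return vision_loss, text_loss, vision_loss + text_loss_weight * text_loss


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_classes, dim = 5, 16  # hypothetical: 3 classes with images + 2 virtual
    learned = rng.normal(size=(num_classes, dim))
    # Two manual prompts per class, placed near the learned features.
    manual = np.repeat(learned, 2, axis=0) + 0.01 * rng.normal(size=(10, dim))
    manual_labels = np.repeat(np.arange(num_classes), 2)
    images = rng.normal(size=(6, dim))
    image_labels = np.array([0, 1, 2, 0, 1, 2])  # images only for classes 0-2
    print(soft_prompt_losses(images, image_labels, learned, manual, manual_labels))
```

In an actual training loop the learned text features would be recomputed from the trainable prompt vectors at every step and the total loss backpropagated through the text encoder into those vectors; the sketch only evaluates the objective on fixed arrays.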

