Prompt Generation Networks for Input-based Adaptation of Frozen Vision Transformers

Generative networks · Adaptation · Prompt · Vision Transformer · Fine-tuning

April 19, 2023

Jochem Loedeman, Maarten C. Stol, Tengda Han, Yuki M. Asano

From arXiv. Tech report, 12 pages. Code: https://github.com/jochemloedeman/PGN

With the introduction of the transformer architecture in computer vision, increasing model scale has been demonstrated as a clear path to achieving performance and robustness gains. However, with model parameter counts reaching the billions, classical finetuning approaches are becoming increasingly limiting and even infeasible when models are hosted as inference APIs, as in NLP. To this end, visual prompt learning, whereby a model is adapted by learning additional inputs, has emerged as a potential solution for adapting frozen and cloud-hosted models: during inference, it requires neither access to the internals of the model's forward pass nor any post-processing. In this work, we propose the Prompt Generation Network (PGN), which generates high-performing, input-dependent prompts by sampling from an end-to-end learned library of tokens. We further introduce the "prompt inversion" trick, with which PGNs can be efficiently trained in a latent space but deployed as strictly input-only prompts for inference. We show that the PGN is effective in adapting pre-trained models to various new datasets: it surpasses previous methods by a large margin on 12/12 datasets and even outperforms full finetuning on 5/12, while requiring 100x fewer parameters.
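The idea of input-dependent prompts drawn from a learned token library can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it assumes, for illustration only, that each prompt is a softmax-weighted mixture of library tokens, with the weights computed from features of the input image. The names `generate_prompts`, `prepend_prompts`, `scorer`, and `library`, and all sizes, are hypothetical, and random values stand in for learned parameters.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def generate_prompts(features, scorer, library):
    """Build input-dependent prompts as soft mixtures of library tokens.

    features: feature vector for one input image (assumed to come from
              some small, hypothetical encoder).
    scorer:   one entry per prompt slot; each entry holds one weight
              vector per library token (hypothetical linear scoring head).
    library:  list of token vectors; learned end-to-end in the real
              method, random here.
    """
    dim = len(library[0])
    prompts = []
    for slot in scorer:
        # One logit per library token, conditioned on the input features.
        logits = [sum(w * f for w, f in zip(weights, features))
                  for weights in slot]
        probs = softmax(logits)
        # Assumption: soft selection, i.e. a convex combination of tokens.
        prompt = [sum(p * token[d] for p, token in zip(probs, library))
                  for d in range(dim)]
        prompts.append(prompt)
    return prompts


def prepend_prompts(prompts, patch_tokens):
    """Concatenate prompts with the image's patch tokens; the frozen
    model would consume this sequence without being modified."""
    return prompts + patch_tokens


rng = random.Random(0)
NUM_TOKENS, TOKEN_DIM, FEAT_DIM, NUM_PROMPTS = 8, 4, 3, 2
library = [[rng.gauss(0, 1) for _ in range(TOKEN_DIM)]
           for _ in range(NUM_TOKENS)]
scorer = [[[rng.gauss(0, 1) for _ in range(FEAT_DIM)]
           for _ in range(NUM_TOKENS)]
          for _ in range(NUM_PROMPTS)]

image_a = [1.0, 0.0, -1.0]   # stand-in features for one image
image_b = [-1.0, 2.0, 0.5]   # stand-in features for another image
prompts_a = generate_prompts(image_a, scorer, library)
prompts_b = generate_prompts(image_b, scorer, library)

patch_tokens = [[0.0] * TOKEN_DIM for _ in range(5)]
sequence = prepend_prompts(prompts_a, patch_tokens)
```

In this sketch only `library` and `scorer` would be trained, which is in the spirit of the small parameter budget the abstract reports; the frozen model itself is never touched. The "prompt inversion" step, which turns latent prompts into input-only prompts, is not shown here because the abstract does not describe its mechanics.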

