FreeLM: Fine-Tuning-Free Language Model - 专知 Papers

Fine-tuning · Pre-training · Language modeling · Language models · Tuning

May 2, 2023

Xiang Li, Xin Jiang, Xuying Meng, Aixin Sun, Yequan Wang

Pre-trained language models (PLMs) have achieved remarkable success in NLP tasks. Despite this success, mainstream solutions largely follow the pre-train-then-fine-tune paradigm, which incurs both high deployment costs and low training efficiency. Fine-tuning on a specific task is nevertheless essential, because PLMs are pre-trained only with the language signal from large raw data. In this paper, we propose a novel fine-tuning-free strategy for language models that considers both a language signal and a teacher signal. The teacher signal is an abstraction of a battery of downstream tasks, provided in a unified proposition format. Trained with both language and strong task-aware teacher signals in an interactive manner, our FreeLM model demonstrates strong generalization and robustness. In experiments, FreeLM outperforms large models such as GPT-3 and InstructGPT on a range of language understanding tasks. FreeLM is also much smaller, with 0.3B parameters compared to 175B in these models.
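The abstract does not spell out the proposition format or the training schedule, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a sentiment-classification example, a hypothetical sentence template, and a simple strict alternation between the two signals; the names `Proposition`, `to_propositions`, and `interleave` are illustrative and do not come from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    """A task instance rewritten as a statement that is either true or false."""
    text: str
    label: bool


def to_propositions(sentence: str, gold: str, candidates: list[str]) -> list[Proposition]:
    """Expand one labelled example into one proposition per candidate label.

    The template below is hypothetical; the paper's actual wording may differ.
    """
    return [
        Proposition(text=f'"{sentence}" expresses a {c} sentiment.', label=(c == gold))
        for c in candidates
    ]


def interleave(language_batches: list[str], teacher_batches: list[str]) -> list[tuple[str, str]]:
    """Alternate language-signal and teacher-signal batches.

    Strict alternation is an assumption standing in for "interactive" training.
    """
    schedule: list[tuple[str, str]] = []
    for lm_batch, teacher_batch in zip(language_batches, teacher_batches):
        schedule.append(("language", lm_batch))
        schedule.append(("teacher", teacher_batch))
    return schedule


props = to_propositions("The movie was wonderful.", "positive", ["positive", "negative"])
schedule = interleave(["lm-0", "lm-1"], ["prop-0", "prop-1"])
```

Framing every task as a true/false proposition means different downstream tasks can share one binary objective, which is one plausible reading of "unified" here.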
