【斯坦福】探究预训练语言模型中的可迁移性，Investigating Transferability in PLM - 专知VIP

会员服务 ·

0

预训练语言模型 · 可迁移性 ·

2020 年 5 月 3 日

【斯坦福】探究预训练语言模型中的可迁移性，Investigating Transferability in PLM

专知会员服务

专知，提供专业可信的知识分发服务，让认知协作更快更好！

虽然探测是在预训练模型表示中识别知识的一种常见技术，但是尚不清楚这种技术是否能够解释像BERT这样在finetuning中端到端训练的模型的下游成功。为了解决这个问题，我们将探测与一种不同的可转移性度量进行比较:部分重新初始化的模型的微调性能的下降。该技术表明，在BERT中，对下游粘合任务具有高探测精度的层对这些任务的高精度来说既不是必要的，也不是充分的。此外，数据集的大小影响层的可移植性:一个人拥有的精细数据越少，BERT的中间层和后中间层就越重要。此外，BERT并没有简单地为各个层找到更好的初始化器;相反，层次之间的相互作用很重要，在细化之前重新排序BERT的层次会极大地损害评估指标。这些结果提供了一种理解参数在预训练语言模型中的可转移性的方法，揭示了这些模型中转移学习的流动性和复杂性。

成为VIP会员查看完整内容

20

相关内容

预训练语言模型

预训练语言模型

近年来，预训练模型（例如ELMo、GPT、BERT和XLNet等）的快速发展大幅提升了诸多NLP任务的整体水平，同时也使得很多应用场景进入到实际落地阶段。预训练语言模型本身就是神经网络语言模型，它的特点包括：第一，可以使用大规模无标注纯文本语料进行训练；第二，可以用于各类下游NLP任务，不是针对某项定制的，但以后可用在下游NIP任务上，你不需要为下游任务专门设计一种神经网络，或者提供一种结构，直接在几种给定的固定框架中选择一种进行 fine-tune，就可以从而得到很好的结果。

知识荟萃

精品入门和进阶教程、论文和代码整理等

更多

查看相关VIP内容、论文、资讯等

1750亿参数！GPT-3来了！31位作者，OpenAI发布小样本学习器语言模型

1750亿参数！GPT-3来了！31位作者，OpenAI发布小样本学习器语言模型

专知会员服务

73+阅读 · 2020年5月30日

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

专知会员服务

36+阅读 · 2020年5月20日

【SIGIR2020-斯坦福大学】一种新的又好又快的BERT类信息检索模型-ColBERT

【SIGIR2020-斯坦福大学】一种新的又好又快的BERT类信息检索模型-ColBERT

专知会员服务

44+阅读 · 2020年4月28日

【ACL2020】不要停止预训练:根据领域和任务自适应调整语言模型，Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

【ACL2020】不要停止预训练:根据领域和任务自适应调整语言模型，Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

专知会员服务

46+阅读 · 2020年4月25日

【知识迁移视觉识别综述论文】Knowledge Transfer in Vision Recognition: A Survey

【知识迁移视觉识别综述论文】Knowledge Transfer in Vision Recognition: A Survey

专知会员服务

30+阅读 · 2020年4月19日

【预训练论文】预训练Transformer校准，Calibration of Pre-trained Transformers

【预训练论文】预训练Transformer校准，Calibration of Pre-trained Transformers

专知会员服务

26+阅读 · 2020年3月19日

【CVPR2020-加州理工大学Devi Parikh】多任务视觉和语言表示学习

【CVPR2020-加州理工大学Devi Parikh】多任务视觉和语言表示学习

专知会员服务

38+阅读 · 2020年2月25日

【Tom Kocmi博士论文】探讨迁移学习在神经机器翻译中的应用，Exploring Benefits of Transfer Learning in Neural Machine Translation

【Tom Kocmi博士论文】探讨迁移学习在神经机器翻译中的应用，Exploring Benefits of Transfer Learning in Neural Machine Translation

专知会员服务

10+阅读 · 2020年1月9日

【AAAI2020-中山大学】知识图谱迁移网络小样本识别，Knowledge Graph Transfer Network for Few-Shot Recognition(附pdf）

【AAAI2020-中山大学】知识图谱迁移网络小样本识别，Knowledge Graph Transfer Network for Few-Shot Recognition(附pdf）

专知会员服务

102+阅读 · 2019年11月24日

【NLP| 推荐文章】从统一文本到文本探讨迁移学习的局限性（Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer）

【NLP| 推荐文章】从统一文本到文本探讨迁移学习的局限性（Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer）

专知会员服务

20+阅读 · 2019年11月24日

ACL 2019 | 多语言BERT的语言表征探索

ACL 2019 | 多语言BERT的语言表征探索

AI科技评论

21+阅读 · 2019年9月6日

Bert 之后：预训练语言模型与自然语言生成

Bert 之后：预训练语言模型与自然语言生成

AINLP

15+阅读 · 2019年7月16日

谷歌更强 NLP 模型 XLNet 开源：20 项任务全面碾压 BERT！

谷歌更强 NLP 模型 XLNet 开源：20 项任务全面碾压 BERT！

雷锋网

5+阅读 · 2019年6月20日

开发 | 谷歌更强NLP模型XLNet开源：20项任务全面碾压BERT！

开发 | 谷歌更强NLP模型XLNet开源：20项任务全面碾压BERT！

AI科技评论

6+阅读 · 2019年6月20日

中文版-BERT-预训练的深度双向Transformer语言模型-详细介绍

中文版-BERT-预训练的深度双向Transformer语言模型-详细介绍

深度学习与NLP

30+阅读 · 2019年3月30日

3分钟看懂史上最强NLP模型BERT

3分钟看懂史上最强NLP模型BERT

新智元

23+阅读 · 2019年2月27日

深度思考 | 从BERT看大规模数据的无监督利用

深度思考 | 从BERT看大规模数据的无监督利用

PaperWeekly

11+阅读 · 2019年2月18日

ELMo的朋友圈：预训练语言模型真的一枝独秀吗？

ELMo的朋友圈：预训练语言模型真的一枝独秀吗？

机器之心

10+阅读 · 2019年1月1日

图解当前最强语言模型BERT：NLP是如何攻克迁移学习的？

图解当前最强语言模型BERT：NLP是如何攻克迁移学习的？

机器之心

4+阅读 · 2018年12月13日

详细解读谷歌新模型 BERT 为什么嗨翻 AI 圈

详细解读谷歌新模型 BERT 为什么嗨翻 AI 圈

人工智能头条

10+阅读 · 2018年10月25日

Commonsense Knowledge Base Completion with Structural and Semantic Context

Commonsense Knowledge Base Completion with Structural and Semantic Context

Arxiv

20+阅读 · 2019年12月19日

A Comprehensive Survey on Transfer Learning

A Comprehensive Survey on Transfer Learning

Arxiv

121+阅读 · 2019年11月7日

Knowledge Distillation from Internal Representations

Knowledge Distillation from Internal Representations

Arxiv

4+阅读 · 2019年10月8日

Span-based Joint Entity and Relation Extraction with Transformer Pre-training

Arxiv

7+阅读 · 2019年9月17日

Incorporating Relation Knowledge into Commonsense Reading Comprehension with Multi-task Learning

Arxiv

5+阅读 · 2019年9月5日

Entity-aware ELMo: Learning Contextual Entity Representation for Entity Disambiguation

Arxiv

3+阅读 · 2019年8月22日

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Arxiv

14+阅读 · 2019年6月19日

Investigating the Successes and Failures of BERT for Passage Re-Ranking

Investigating the Successes and Failures of BERT for Passage Re-Ranking

Arxiv

3+阅读 · 2019年5月5日

Harvesting Paragraph-Level Question-Answer Pairs from Wikipedia

Arxiv

3+阅读 · 2018年5月15日

Supervised and Unsupervised Transfer Learning for Question Answering

Arxiv

4+阅读 · 2018年4月21日

VIP会员

相关主题

预训练语言模型

相关VIP内容

1750亿参数！GPT-3来了！31位作者，OpenAI发布小样本学习器语言模型

1750亿参数！GPT-3来了！31位作者，OpenAI发布小样本学习器语言模型

专知会员服务

73+阅读 · 2020年5月30日

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

专知会员服务

36+阅读 · 2020年5月20日

【SIGIR2020-斯坦福大学】一种新的又好又快的BERT类信息检索模型-ColBERT

【SIGIR2020-斯坦福大学】一种新的又好又快的BERT类信息检索模型-ColBERT

专知会员服务

44+阅读 · 2020年4月28日

【ACL2020】不要停止预训练:根据领域和任务自适应调整语言模型，Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

【ACL2020】不要停止预训练:根据领域和任务自适应调整语言模型，Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

专知会员服务

46+阅读 · 2020年4月25日

【知识迁移视觉识别综述论文】Knowledge Transfer in Vision Recognition: A Survey

【知识迁移视觉识别综述论文】Knowledge Transfer in Vision Recognition: A Survey

专知会员服务

30+阅读 · 2020年4月19日

【预训练论文】预训练Transformer校准，Calibration of Pre-trained Transformers

【预训练论文】预训练Transformer校准，Calibration of Pre-trained Transformers

专知会员服务

26+阅读 · 2020年3月19日

【CVPR2020-加州理工大学Devi Parikh】多任务视觉和语言表示学习

【CVPR2020-加州理工大学Devi Parikh】多任务视觉和语言表示学习

专知会员服务

38+阅读 · 2020年2月25日

【Tom Kocmi博士论文】探讨迁移学习在神经机器翻译中的应用，Exploring Benefits of Transfer Learning in Neural Machine Translation

【Tom Kocmi博士论文】探讨迁移学习在神经机器翻译中的应用，Exploring Benefits of Transfer Learning in Neural Machine Translation

专知会员服务

10+阅读 · 2020年1月9日

【AAAI2020-中山大学】知识图谱迁移网络小样本识别，Knowledge Graph Transfer Network for Few-Shot Recognition(附pdf）

【AAAI2020-中山大学】知识图谱迁移网络小样本识别，Knowledge Graph Transfer Network for Few-Shot Recognition(附pdf）

专知会员服务

102+阅读 · 2019年11月24日

【NLP| 推荐文章】从统一文本到文本探讨迁移学习的局限性（Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer）

【NLP| 推荐文章】从统一文本到文本探讨迁移学习的局限性（Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer）

专知会员服务

20+阅读 · 2019年11月24日

热门VIP内容

开通专知VIP会员享更多权益服务

《使用量化测量将传感器节点关联到融合中心的算法设计》171页

军事前沿模型

提升军事训练能力的最佳人工智能模拟工具

《社交媒体信息作战》最新48页技术报告

相关资讯

ACL 2019 | 多语言BERT的语言表征探索

ACL 2019 | 多语言BERT的语言表征探索

AI科技评论

21+阅读 · 2019年9月6日

Bert 之后：预训练语言模型与自然语言生成

Bert 之后：预训练语言模型与自然语言生成

AINLP

15+阅读 · 2019年7月16日

谷歌更强 NLP 模型 XLNet 开源：20 项任务全面碾压 BERT！

谷歌更强 NLP 模型 XLNet 开源：20 项任务全面碾压 BERT！

雷锋网

5+阅读 · 2019年6月20日

开发 | 谷歌更强NLP模型XLNet开源：20项任务全面碾压BERT！

开发 | 谷歌更强NLP模型XLNet开源：20项任务全面碾压BERT！

AI科技评论

6+阅读 · 2019年6月20日

中文版-BERT-预训练的深度双向Transformer语言模型-详细介绍

中文版-BERT-预训练的深度双向Transformer语言模型-详细介绍

深度学习与NLP

30+阅读 · 2019年3月30日

3分钟看懂史上最强NLP模型BERT

3分钟看懂史上最强NLP模型BERT

新智元

23+阅读 · 2019年2月27日

深度思考 | 从BERT看大规模数据的无监督利用

深度思考 | 从BERT看大规模数据的无监督利用

PaperWeekly

11+阅读 · 2019年2月18日

ELMo的朋友圈：预训练语言模型真的一枝独秀吗？

ELMo的朋友圈：预训练语言模型真的一枝独秀吗？

机器之心

10+阅读 · 2019年1月1日

图解当前最强语言模型BERT：NLP是如何攻克迁移学习的？

图解当前最强语言模型BERT：NLP是如何攻克迁移学习的？

机器之心

4+阅读 · 2018年12月13日

详细解读谷歌新模型 BERT 为什么嗨翻 AI 圈

详细解读谷歌新模型 BERT 为什么嗨翻 AI 圈

人工智能头条

10+阅读 · 2018年10月25日

相关论文

Commonsense Knowledge Base Completion with Structural and Semantic Context

Commonsense Knowledge Base Completion with Structural and Semantic Context

Arxiv

20+阅读 · 2019年12月19日

A Comprehensive Survey on Transfer Learning

A Comprehensive Survey on Transfer Learning

Arxiv

121+阅读 · 2019年11月7日

Knowledge Distillation from Internal Representations

Knowledge Distillation from Internal Representations

Arxiv

4+阅读 · 2019年10月8日

Span-based Joint Entity and Relation Extraction with Transformer Pre-training

Arxiv

7+阅读 · 2019年9月17日

Incorporating Relation Knowledge into Commonsense Reading Comprehension with Multi-task Learning

Arxiv

5+阅读 · 2019年9月5日

Entity-aware ELMo: Learning Contextual Entity Representation for Entity Disambiguation

Arxiv

3+阅读 · 2019年8月22日

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Arxiv

14+阅读 · 2019年6月19日

Investigating the Successes and Failures of BERT for Passage Re-Ranking

Investigating the Successes and Failures of BERT for Passage Re-Ranking

Arxiv

3+阅读 · 2019年5月5日

Harvesting Paragraph-Level Question-Answer Pairs from Wikipedia

Arxiv

3+阅读 · 2018年5月15日

Supervised and Unsupervised Transfer Learning for Question Answering

Arxiv

4+阅读 · 2018年4月21日

微信扫码咨询专知VIP会员