高效率的中级任务甄选 (What to Pre-Train on? Efficient Intermediate Task Selection) - 专知论文

会员服务 ·

0

可辨认的 · 数据集 · 情景 · Less · 小样本学习 ·

2021 年 4 月 16 日

What to Pre-Train on? Efficient Intermediate Task Selection

翻译：高效率的中级任务甄选

Clifton Poth,Jonas Pfeiffer,Andreas Rücklé,Iryna Gurevych

Intermediate task fine-tuning has been shown to culminate in large transfer gains across many NLP tasks. With an abundance of candidate datasets as well as pre-trained language models, it has become infeasible to run the cross-product of all combinations to find the best transfer setting. In this work we first establish that similar sequential fine-tuning gains can be achieved in adapter settings, and subsequently consolidate previously proposed methods that efficiently identify beneficial tasks for intermediate transfer learning. We experiment with a diverse set of 42 intermediate and 11 target English classification, multiple choice, question answering, and sequence tagging tasks. Our results show that efficient embedding based methods that rely solely on the respective datasets outperform computational expensive few-shot fine-tuning approaches. Our best methods achieve an average Regret@3 of less than 1% across all target tasks, demonstrating that we are able to efficiently identify the best datasets for intermediate training.

翻译：中期任务微调被证明最终导致在很多国家劳工规划任务中大量转移收益。大量候选数据集以及经过预先培训的语言模型已经无法运行所有组合的交叉产品以找到最佳转移环境。在这项工作中,我们首先确定在适应器环境中可以实现类似的顺序微调收益,并随后整合先前提出的有效确定中间转移学习有益任务的方法。我们试验了一套多样的、42个中间和11个目标的英文分类、多种选择、问答和顺序标记任务。我们的结果显示,有效的嵌入基于方法完全依靠各自的数据集,超越了成本昂贵的微调方法。我们的最佳方法在所有目标任务中都实现了平均不超过1%的Regret@3, 表明我们能够有效地确定中间培训的最佳数据集。

0

相关内容

可辨认的

【DeepMind教程】蒙特卡罗树搜索，60页ppt

专知会员服务

55+阅读 · 2021年4月7日

NLP必读经典文献100篇

专知会员服务

123+阅读 · 2020年9月8日

【MIT】反偏差对比学习，Debiased Contrastive Learning

【MIT】反偏差对比学习，Debiased Contrastive Learning

专知会员服务

90+阅读 · 2020年7月4日

MIT-深度学习Deep Learning State of the Art in 2020，87页ppt

MIT-深度学习Deep Learning State of the Art in 2020，87页ppt

专知会员服务

61+阅读 · 2020年2月17日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

52+阅读 · 2020年1月30日

【NLP模型的跨语言/跨领域迁移】《Transferring NLP models across languages and domains》

【NLP模型的跨语言/跨领域迁移】《Transferring NLP models across languages and domains》

专知会员服务

42+阅读 · 2019年11月25日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

18+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

45+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

53+阅读 · 2019年10月17日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

99+阅读 · 2019年10月9日

灾难性遗忘问题新视角：迁移-干扰平衡

灾难性遗忘问题新视角：迁移-干扰平衡

CreateAMind

17+阅读 · 2019年7月6日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

23+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

25+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

26+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

41+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

16+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning

Arxiv

0+阅读 · 2021年6月9日

Learning Pseudo-Backdoors for Mixed Integer Programs

Arxiv

0+阅读 · 2021年6月9日

Efficient Lottery Ticket Finding: Less Data is More

Arxiv

0+阅读 · 2021年6月6日

On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation

Arxiv

0+阅读 · 2021年6月6日

Faster Meta Update Strategy for Noise-Robust Deep Learning

Arxiv

11+阅读 · 2021年4月30日

Adaptive Methods for Real-World Domain Generalization

Arxiv

6+阅读 · 2021年3月30日

LogME: Practical Assessment of Pre-trained Models for Transfer Learning

Arxiv

4+阅读 · 2021年2月22日

Making Pre-trained Language Models Better Few-shot Learners

Arxiv

14+阅读 · 2020年12月31日

Interpretable Adversarial Training for Text

Interpretable Adversarial Training for Text

Arxiv

5+阅读 · 2019年5月30日

Multi-task learning to improve natural language understanding

Arxiv

4+阅读 · 2018年12月17日

VIP会员

文章信息

相关主题

小样本学习

相关VIP内容

【DeepMind教程】蒙特卡罗树搜索，60页ppt

专知会员服务

55+阅读 · 2021年4月7日

NLP必读经典文献100篇

专知会员服务

123+阅读 · 2020年9月8日

【MIT】反偏差对比学习，Debiased Contrastive Learning

【MIT】反偏差对比学习，Debiased Contrastive Learning

专知会员服务

90+阅读 · 2020年7月4日

MIT-深度学习Deep Learning State of the Art in 2020，87页ppt

MIT-深度学习Deep Learning State of the Art in 2020，87页ppt

专知会员服务

61+阅读 · 2020年2月17日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

52+阅读 · 2020年1月30日

【NLP模型的跨语言/跨领域迁移】《Transferring NLP models across languages and domains》

【NLP模型的跨语言/跨领域迁移】《Transferring NLP models across languages and domains》

专知会员服务

42+阅读 · 2019年11月25日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

18+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

45+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

53+阅读 · 2019年10月17日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

99+阅读 · 2019年10月9日

热门VIP内容

相关资讯

灾难性遗忘问题新视角：迁移-干扰平衡

灾难性遗忘问题新视角：迁移-干扰平衡

CreateAMind

17+阅读 · 2019年7月6日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

23+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

25+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

26+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

41+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

16+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

相关论文

DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning

Arxiv

0+阅读 · 2021年6月9日

Learning Pseudo-Backdoors for Mixed Integer Programs

Arxiv

0+阅读 · 2021年6月9日

Efficient Lottery Ticket Finding: Less Data is More

Arxiv

0+阅读 · 2021年6月6日

On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation

Arxiv

0+阅读 · 2021年6月6日

Faster Meta Update Strategy for Noise-Robust Deep Learning

Arxiv

11+阅读 · 2021年4月30日

Adaptive Methods for Real-World Domain Generalization

Arxiv

6+阅读 · 2021年3月30日

LogME: Practical Assessment of Pre-trained Models for Transfer Learning

Arxiv

4+阅读 · 2021年2月22日

Making Pre-trained Language Models Better Few-shot Learners

Arxiv

14+阅读 · 2020年12月31日

Interpretable Adversarial Training for Text

Interpretable Adversarial Training for Text

Arxiv

5+阅读 · 2019年5月30日

Multi-task learning to improve natural language understanding

Arxiv

4+阅读 · 2018年12月17日

微信扫码咨询专知VIP会员