Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning

Few-shot learning · Learning · Language modeling · tuning · Better

May 12, 2023

Yu Meng, Martin Michalski, Jiaxin Huang, Yu Zhang, Tarek Abdelzaher, Jiawei Han

From arXiv; ICML 2023. (Code: https://github.com/yumeng5/FewGen)

Recent studies have revealed the intriguing few-shot learning ability of pretrained language models (PLMs): They can quickly adapt to a new task when fine-tuned on a small amount of labeled data formulated as prompts, without requiring abundant task-specific annotations. Despite their promising performance, most existing few-shot approaches that only learn from the small training set still underperform fully supervised training by nontrivial margins. In this work, we study few-shot learning with PLMs from a different perspective: We first tune an autoregressive PLM on the few-shot samples and then use it as a generator to synthesize a large amount of novel training samples which augment the original training set. To encourage the generator to produce label-discriminative samples, we train it via weighted maximum likelihood where the weight of each token is automatically adjusted based on a discriminative meta-learning objective. A classification PLM can then be fine-tuned on both the few-shot and the synthetic samples with regularization for better generalization and stability. Our approach FewGen achieves an overall better result across seven classification tasks of the GLUE benchmark than existing few-shot learning methods, improving no-augmentation methods by 5+ average points, and outperforming augmentation methods by 3+ average points.
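The weighted maximum likelihood idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `softmax` and `weighted_nll` are hypothetical, and the per-token weights are passed in by hand, whereas the abstract says they are adjusted automatically through a discriminative meta-learning objective (that part is not shown here). The sketch only shows how a per-token weight scales each token's contribution to the negative log-likelihood.

```python
import math


def softmax(logits):
    """Convert a list of raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_nll(token_logits, target_ids, token_weights):
    """Weighted negative log-likelihood of one sequence.

    token_logits:  one list of vocabulary scores per position
    target_ids:    the index of the correct token at each position
    token_weights: one non-negative weight per position (assumed to be given;
                   in the paper these come from a meta-learned objective)
    """
    if not (len(token_logits) == len(target_ids) == len(token_weights)):
        raise ValueError("all inputs must have one entry per token position")
    loss = 0.0
    for logits, target, weight in zip(token_logits, target_ids, token_weights):
        probs = softmax(logits)
        # Each token contributes its usual NLL term, scaled by its weight.
        loss += -weight * math.log(probs[target])
    return loss


if __name__ == "__main__":
    # Toy two-token vocabulary, two positions, hand-picked weights.
    logits = [[2.0, 0.0], [0.0, 0.0]]
    print(weighted_nll(logits, [0, 1], [1.0, 0.5]))
```

Setting every weight to 1 recovers the ordinary maximum likelihood loss, and a weight of 0 drops a token from the objective entirely, which is one way to see how weighting can emphasize label-discriminative tokens.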

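Related content

Few-shot learning

Few-shot learning (FSL) addresses how to improve neural network performance when little data is available. A commonly used family of FSL methods is meta-learning. Like ordinary neural network training, meta-learning has a training phase and a testing phase, called meta-training and meta-testing respectively.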
