Learning from Natural Language Feedback - Zhuanzhi (专知) Papers

April 29, 2022

Learning from Natural Language Feedback

Jérémy Scheurer,Jon Ander Campos,Jun Shern Chan,Angelica Chen,Kyunghyun Cho,Ethan Perez

From arXiv; The First Workshop on Learning with Natural Language Supervision at ACL 2022

Pretrained language models often do not perform tasks in ways that are in line with our preferences, e.g., generating offensive text or factually incorrect summaries. Recent work approaches the above issue by learning from a simple form of human evaluation: comparisons between pairs of model-generated task outputs. Comparison feedback conveys limited information about human preferences per human evaluation. Here, we propose to learn from natural language feedback, which conveys more information per human evaluation. We learn from language feedback on model outputs using a three-step learning algorithm. First, we condition the language model on the initial output and feedback to generate many refinements. Second, we choose the refinement with the highest similarity to the feedback. Third, we finetune a language model to maximize the likelihood of the chosen refinement given the input. In synthetic experiments, we first evaluate whether language models accurately incorporate feedback to produce refinements, finding that only large language models (175B parameters) do so. Using only 100 samples of human-written feedback, our learning algorithm finetunes a GPT-3 model to roughly human-level summarization.
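The three-step procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `generate_refinements` callable (a stand-in for sampling from a language model), and the bag-of-words cosine similarity are all assumptions made for this example, since the abstract does not say how similarity to the feedback is measured. Step 3 is reduced to building the (input, target) pair that a finetuning job would consume.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical signature for step 1: (input, initial output, feedback, n) -> n refinements.
RefinementGenerator = Callable[[str, str, str, int], List[str]]


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_refinement(feedback: str, refinements: List[str]) -> str:
    """Step 2: pick the refinement most similar to the feedback.

    Assumption: word-overlap cosine similarity stands in for whatever
    similarity measure is actually used.
    """
    feedback_vec = bag_of_words(feedback)
    return max(refinements,
               key=lambda r: cosine_similarity(bag_of_words(r), feedback_vec))


def build_finetuning_example(task_input: str,
                             initial_output: str,
                             feedback: str,
                             generate_refinements: RefinementGenerator,
                             num_samples: int = 4) -> Tuple[str, str]:
    """Run steps 1 and 2, then return the (input, target) pair for step 3."""
    # Step 1: condition on the initial output and feedback to sample refinements.
    candidates = generate_refinements(task_input, initial_output, feedback, num_samples)
    # Step 2: keep the candidate that best matches the feedback.
    best = select_refinement(feedback, candidates)
    # Step 3 (not run here): finetune to maximize the likelihood of `best` given `task_input`.
    return task_input, best


def toy_generator(task_input: str, initial_output: str, feedback: str, n: int) -> List[str]:
    """Hypothetical stub that returns canned refinements instead of calling a model."""
    canned = [
        "The report covers sales.",
        "The report covers sales and mentions the budget deficit.",
        "Sales went up.",
    ]
    return canned[:n]


example = build_finetuning_example(
    task_input="Summarize the quarterly report.",
    initial_output="The report covers sales.",
    feedback="The summary should mention the budget deficit.",
    generate_refinements=toy_generator,
)
print(example)
```

In practice the stub would be replaced by sampling from a language model prompted with the input, initial output, and feedback, and the collected pairs would be passed to a standard supervised finetuning routine.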
