Critic-Guided Decoding for Controlled Text Generation - Zhuanzhi (专知) Papers

Topics: Controller · Decoding · Critic · Weight · Language modeling

December 21, 2022

Critic-Guided Decoding for Controlled Text Generation


Minbeom Kim, Hwanhee Lee, Kang Min Yoo, Joonsuk Park, Hwaran Lee, Kyomin Jung

From arXiv, 11 pages, 6 figures

Steering language generation towards objectives or away from undesired content has been a long-standing goal in utilizing language models (LMs). Recent work has demonstrated reinforcement learning and weighted decoding as effective approaches to achieving a higher level of language control and quality, each with its own pros and cons. In this work, we propose a novel critic decoding method for controlled language generation (CriticControl) that combines the strengths of reinforcement learning and weighted decoding. Specifically, we adopt the actor-critic framework to train an LM-steering critic from non-differentiable reward models. Similar to weighted decoding, our method freezes the language model and manipulates the output token distribution using the trained critic, improving training efficiency and stability. Evaluation of our method on three controlled generation tasks, namely topic control, sentiment control, and detoxification, shows that our approach generates more coherent and well-controlled texts than previous methods. In addition, CriticControl demonstrates superior generalization ability in zero-shot settings. Human evaluation studies also corroborate our findings.
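The idea of keeping the language model frozen while a critic reshapes the next-token distribution can be illustrated with a toy sketch. This is not the paper's implementation: the additive rule `logit + beta * value`, the function names `softmax` and `critic_reweight`, the `beta` parameter, and the three-token vocabulary with its scores are all assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def critic_reweight(lm_logits, critic_values, beta=1.0):
    """Shift frozen-LM logits by beta * critic value, then renormalise.

    lm_logits:     next-token logits from the (frozen) language model
    critic_values: hypothetical critic scores, one per candidate token
    beta:          hypothetical control strength; 0.0 leaves the LM unchanged
    """
    if len(lm_logits) != len(critic_values):
        raise ValueError("lm_logits and critic_values must have equal length")
    shifted = [l + beta * v for l, v in zip(lm_logits, critic_values)]
    return softmax(shifted)


# Toy example: the LM is indifferent between three tokens,
# while the critic prefers the first and penalises the last.
vocab = ["great", "fine", "awful"]
lm_logits = [1.0, 1.0, 1.0]
critic_values = [2.0, 0.0, -2.0]

base = softmax(lm_logits)
steered = critic_reweight(lm_logits, critic_values, beta=1.0)

for token, p_base, p_steered in zip(vocab, base, steered):
    print(f"{token:>6}: base={p_base:.3f} steered={p_steered:.3f}")
```

Because only the output distribution is modified, the language model's weights are never updated in this sketch; with `beta=0.0` the steered distribution equals the base distribution.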

