Non-autoregressive Transformer (NAT) is a family of text generation models that aims to reduce decoding latency by predicting whole sentences in parallel. However, this latency reduction sacrifices the ability to capture left-to-right dependencies, making NAT learning very challenging. In this paper, we present theoretical and empirical analyses that reveal the challenges of NAT learning and propose a unified perspective for understanding existing successes. First, we show that simply training a NAT by maximizing the likelihood yields an approximation of the marginal distributions but drops all dependencies between tokens, where the dropped information can be measured by the dataset's conditional total correlation. Second, we formalize many previous objectives in a unified framework and show that their success can be understood as maximizing the likelihood on a proxy distribution, which reduces the information loss. Empirical studies show that our perspective can explain phenomena in NAT learning and guide the design of new training methods.
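For reference, the conditional total correlation mentioned above can be written in its standard information-theoretic form (the notation here is ours, not necessarily the paper's): for a source sentence $X$ and a target sentence $Y = (y_1, \ldots, y_n)$,
\[
\mathcal{C}(Y \mid X) \;=\; \sum_{i=1}^{n} H(y_i \mid X) \;-\; H(Y \mid X) \;=\; \mathrm{KL}\!\left( p(Y \mid X) \,\Big\|\, \prod_{i=1}^{n} p(y_i \mid X) \right),
\]
i.e., the divergence between the true joint conditional distribution and the product of its per-token marginals. This quantity is exactly the inter-token dependency information that a fully factorized, position-wise independent NAT predictor cannot represent.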