带有逐项协议的神经机器翻译动态图层聚合 (Dynamic Layer Aggregation for Neural Machine Translation with Routing-by-Agreement) - 专知论文

会员服务 ·

0

层 · Machine Translation · MoDELS · Networks · INFORMS ·

2019 年 2 月 15 日

Dynamic Layer Aggregation for Neural Machine Translation with Routing-by-Agreement

翻译：带有逐项协议的神经机器翻译动态图层聚合

Zi-Yi Dou,Zhaopeng Tu,Xing Wang,Longyue Wang,Shuming Shi,Tong Zhang

from arxiv, AAAI 2019

With the promising progress of deep neural networks, layer aggregation has been used to fuse information across layers in various fields, such as computer vision and machine translation. However, most of the previous methods combine layers in a static fashion in that their aggregation strategy is independent of specific hidden states. Inspired by recent progress on capsule networks, in this paper we propose to use routing-by-agreement strategies to aggregate layers dynamically. Specifically, the algorithm learns the probability of a part (individual layer representations) assigned to a whole (aggregated representations) in an iterative way and combines parts accordingly. We implement our algorithm on top of the state-of-the-art neural machine translation model TRANSFORMER and conduct experiments on the widely-used WMT14 English-German and WMT17 Chinese-English translation datasets. Experimental results across language pairs show that the proposed approach consistently outperforms the strong baseline model and a representative static aggregation model.

翻译：随着深层神经网络的可喜进展,层集已用于整合各个领域(如计算机视觉和机器翻译)的跨层信息,但以往方法大多以静态方式将层层组合在一起,因为其集成战略独立于具体的隐藏状态。受胶囊网络最近进展的启发,我们在本文件中提议使用逐条路线战略动态地汇总层。具体地说,算法以迭接方式了解分配给整个层(个人层表示)的部分(个人层表示)的概率,并相应组合部分。我们在最先进的神经机转换模型TRANSORMEAR上采用我们的算法,并在广泛使用的WMT14英语-德语和WMT17中文-英语翻译数据集上进行实验。各语言配对的实验结果显示,拟议的方法始终比强的基线模型和具有代表性的静态汇总模型要强。

0

相关内容

最新《机器学习最优化》课程笔记，36页pdf，Optimization for Machine Learning

专知会员服务

164+阅读 · 2020年5月10日

神经网络的拓扑结构，TOPOLOGY OF DEEP NEURAL NETWORKS

神经网络的拓扑结构，TOPOLOGY OF DEEP NEURAL NETWORKS

专知会员服务

29+阅读 · 2020年4月15日

【机器学习最优化课程笔记】Optimization for Machine Learning，36页pdf

【机器学习最优化课程笔记】Optimization for Machine Learning，36页pdf

专知会员服务

113+阅读 · 2020年3月25日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

59+阅读 · 2020年3月19日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

234+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

45+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

30+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

53+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

167+阅读 · 2019年10月11日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

270+阅读 · 2019年10月9日

内涵网络嵌入：Content-rich Network Embedding

内涵网络嵌入：Content-rich Network Embedding

我爱读PAMI

4+阅读 · 2019年11月5日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

25+阅读 · 2019年5月18日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

15+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

「知识表示学习」专题论文推荐 | 每周论文清单

「知识表示学习」专题论文推荐 | 每周论文清单

PaperWeekly

18+阅读 · 2018年2月5日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

10+阅读 · 2017年11月12日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

自然语言处理（二）机器翻译篇 (NLP: machine translation)

自然语言处理（二）机器翻译篇 (NLP: machine translation)

DeepLearning中文论坛

10+阅读 · 2015年7月1日

Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation

Arxiv

18+阅读 · 2019年11月20日

Multi-Head Attention with Disagreement Regularization

Arxiv

9+阅读 · 2018年10月24日

Improving the Transformer Translation Model with Document-Level Context

Arxiv

4+阅读 · 2018年10月8日

Doubly Attentive Transformer Machine Translation

Doubly Attentive Transformer Machine Translation

Arxiv

4+阅读 · 2018年7月30日

Fusing Recency into Neural Machine Translation with an Inter-Sentence Gate Model

Arxiv

3+阅读 · 2018年6月12日

Japanese Predicate Conjugation for Neural Machine Translation

Arxiv

3+阅读 · 2018年5月25日

Handling Homographs in Neural Machine Translation

Arxiv

3+阅读 · 2018年3月28日

Self-Attentive Residual Decoder for Neural Machine Translation

Arxiv

5+阅读 · 2018年3月22日

Joint Training for Neural Machine Translation Models with Monolingual Data

Arxiv

4+阅读 · 2018年3月1日

Variational Recurrent Neural Machine Translation

Arxiv

5+阅读 · 2018年1月16日

VIP会员

文章信息

相关主题

Machine Translation

相关VIP内容

最新《机器学习最优化》课程笔记，36页pdf，Optimization for Machine Learning

专知会员服务

164+阅读 · 2020年5月10日

神经网络的拓扑结构，TOPOLOGY OF DEEP NEURAL NETWORKS

神经网络的拓扑结构，TOPOLOGY OF DEEP NEURAL NETWORKS

专知会员服务

29+阅读 · 2020年4月15日

【机器学习最优化课程笔记】Optimization for Machine Learning，36页pdf

【机器学习最优化课程笔记】Optimization for Machine Learning，36页pdf

专知会员服务

113+阅读 · 2020年3月25日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

59+阅读 · 2020年3月19日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

234+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

45+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

30+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

53+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

167+阅读 · 2019年10月11日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

270+阅读 · 2019年10月9日

热门VIP内容

相关资讯

内涵网络嵌入：Content-rich Network Embedding

内涵网络嵌入：Content-rich Network Embedding

我爱读PAMI

4+阅读 · 2019年11月5日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

25+阅读 · 2019年5月18日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

15+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

「知识表示学习」专题论文推荐 | 每周论文清单

「知识表示学习」专题论文推荐 | 每周论文清单

PaperWeekly

18+阅读 · 2018年2月5日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

10+阅读 · 2017年11月12日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

自然语言处理（二）机器翻译篇 (NLP: machine translation)

自然语言处理（二）机器翻译篇 (NLP: machine translation)

DeepLearning中文论坛

10+阅读 · 2015年7月1日

相关论文

Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation

Arxiv

18+阅读 · 2019年11月20日

Multi-Head Attention with Disagreement Regularization

Arxiv

9+阅读 · 2018年10月24日

Improving the Transformer Translation Model with Document-Level Context

Arxiv

4+阅读 · 2018年10月8日

Doubly Attentive Transformer Machine Translation

Doubly Attentive Transformer Machine Translation

Arxiv

4+阅读 · 2018年7月30日

Fusing Recency into Neural Machine Translation with an Inter-Sentence Gate Model

Arxiv

3+阅读 · 2018年6月12日

Japanese Predicate Conjugation for Neural Machine Translation

Arxiv

3+阅读 · 2018年5月25日

Handling Homographs in Neural Machine Translation

Arxiv

3+阅读 · 2018年3月28日

Self-Attentive Residual Decoder for Neural Machine Translation

Arxiv

5+阅读 · 2018年3月22日

Joint Training for Neural Machine Translation Models with Monolingual Data

Arxiv

4+阅读 · 2018年3月1日

Variational Recurrent Neural Machine Translation

Arxiv

5+阅读 · 2018年1月16日

微信扫码咨询专知VIP会员