On the Pareto Front of Multilingual Neural Machine Translation


Multilingual neural machine translation · Neural machine translation · Power law · Machine translation · Data imbalance

April 7, 2023


Liang Chen, Shuming Ma, Dongdong Zhang, Furu Wei, Baobao Chang

From arXiv, 14 pages, 6 figures, code released at https://github.com/chenllliang/ParetoMNMT

In this work, we study how the generalization performance of a given direction changes with its sampling ratio in Multilingual Neural Machine Translation (MNMT). By training over 200 multilingual models with various model sizes, directions, and total numbers of tasks, we find that scalarization leads to a multitask trade-off front that deviates from the traditional Pareto front when there exists data imbalance in the training corpus. That is, the performance of certain translation directions does not improve with the increase of its weight in the multi-task optimization objective, which poses a great challenge to improve the overall performance of all directions. Based on our observations, we propose the Double Power Law to predict the unique performance trade-off front in MNMT, which is robust across various languages, data adequacy, and the number of tasks. Finally, we formulate the sample ratio selection problem in MNMT as an optimization problem based on the Double Power Law, which achieves better performance than temperature searching and gradient manipulation methods using up to half of the total training budget in our experiments.
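For readers unfamiliar with the temperature searching baseline named in the abstract, the following is a minimal sketch of the widely used temperature-based sampling rule, in which the sampling ratio of direction i is proportional to (n_i / Σ n)^(1/T). It is not the paper's Double Power Law method. The function name, the direction labels, and the corpus sizes are hypothetical and chosen only for illustration.

```python
def temperature_sampling_ratios(sizes, temperature):
    """Return sampling ratios p_i proportional to (n_i / sum(n)) ** (1 / T).

    sizes: dict mapping a direction name to its number of training examples.
    temperature: T > 0; T = 1 is proportional sampling, larger T moves
    the ratios toward uniform and so upweights low-resource directions.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    total = sum(sizes.values())
    # Unnormalized weights after tempering the empirical data distribution.
    weights = {d: (n / total) ** (1.0 / temperature) for d, n in sizes.items()}
    norm = sum(weights.values())
    return {d: w / norm for d, w in weights.items()}


# Hypothetical, imbalanced two-direction corpus (illustrative numbers only).
example_sizes = {"en-fr": 1_000_000, "en-tr": 10_000}
for t in (1, 5, 100):
    print(t, temperature_sampling_ratios(example_sizes, t))
```

Searching over T means training one model per candidate temperature and comparing results, which is the training cost that the abstract's optimization-based ratio selection is reported to reduce.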

