This paper describes Facebook FAIR's submission to the WMT19 shared news translation task. We participate in two language pairs and four language directions, English ↔ German and English ↔ Russian. Following our submission from last year, our baseline systems are large BPE-based transformer models trained with the Fairseq sequence modeling toolkit which rely on sampled back-translations. This year we experiment with different bitext data filtering schemes, as well as with adding filtered back-translated data. We also ensemble and fine-tune our models on domain-specific data, then decode using noisy channel model reranking. Our submissions are ranked first in all four directions of the human evaluation campaign. On En→De, our system significantly outperforms other systems as well as human translations. This system improves upon our WMT'18 submission by 4.5 BLEU points.
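The noisy channel reranking mentioned above can be illustrated with a minimal sketch. The usual formulation combines a direct model score log P(y|x) with a channel model score log P(x|y) and a language model score log P(y); the weights, function names, and toy log-probabilities below are illustrative assumptions, not the paper's actual configuration.

```python
def noisy_channel_score(log_p_direct, log_p_channel, log_p_lm,
                        w_channel=1.0, w_lm=1.0):
    """Combine direct, channel, and LM log-probabilities.

    log_p_direct  : log P(y|x) from the forward translation model
    log_p_channel : log P(x|y) from a reverse (channel) model
    log_p_lm      : log P(y) from a target-side language model
    Weights are tuned on held-out data in practice; defaults here are arbitrary.
    """
    return log_p_direct + w_channel * log_p_channel + w_lm * log_p_lm


def rerank(candidates, w_channel=1.0, w_lm=1.0):
    # candidates: list of (hypothesis, log P(y|x), log P(x|y), log P(y))
    # returned by an n-best decoder; pick the hypothesis with the best
    # combined noisy-channel score.
    return max(
        candidates,
        key=lambda c: noisy_channel_score(c[1], c[2], c[3], w_channel, w_lm),
    )[0]


# Toy n-best list: hypothesis B has a slightly worse direct score but is
# preferred once the channel and LM terms are taken into account.
candidates = [
    ("hypothesis A", -1.0, -5.0, -2.0),
    ("hypothesis B", -1.5, -2.0, -1.5),
]
best = rerank(candidates)  # → "hypothesis B"
```

In practice the reverse model and language model are full neural networks scored over each n-best hypothesis, and the interpolation weights are tuned on a development set.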