Generalization and reliability of multilingual translation often depend heavily on the amount of parallel data available for each language pair of interest. In this paper, we focus on zero-shot generalization---a challenging setup that tests models on translation directions they have not been optimized for at training time. To address this, we (i) reformulate multilingual translation as probabilistic inference, (ii) define the notion of zero-shot consistency and show why standard training often results in models unsuitable for zero-shot tasks, and (iii) introduce a consistent agreement-based training method that encourages the model to produce equivalent translations of parallel sentences in auxiliary languages. We test our multilingual NMT models on multiple public zero-shot translation benchmarks (IWSLT17, UN corpus, Europarl) and show that agreement-based learning often yields 2-3 BLEU improvements on zero-shot directions over strong baselines, without any loss in performance on supervised translation directions.