Simultaneous machine translation (SiMT) outputs the translation while still receiving the source input, and hence needs to balance the received source information against the already translated target information to decide sensibly between waiting for more input and outputting translation. Previous methods always balance source and target information at the token level, either directly waiting for a fixed number of tokens or adjusting the waiting based on the current token. In this paper, we propose a Wait-info Policy to balance source and target at the information level. We first quantify the amount of information contained in each token, which we name info. Then, during simultaneous translation, the decision to wait or output is made by comparing the total info of the previous target outputs with that of the received source inputs. Experiments show that our method outperforms strong baselines and achieves a better balance via the proposed info.
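The wait-or-output decision described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how info is computed or what the exact comparison rule is, so the per-token info values are taken as given inputs, and the function name `wait_info_schedule`, the arguments `src_info` and `tgt_info`, and the threshold `k` are all hypothetical. The sketch assumes the model keeps reading source tokens until the total received source info exceeds the total info of the previous target outputs by at least `k`.

```python
from typing import List, Sequence


def wait_info_schedule(
    src_info: Sequence[float],
    tgt_info: Sequence[float],
    k: float = 1.0,
) -> List[int]:
    """For each target token, return how many source tokens had been read
    when it was output.

    Assumption (not from the abstract): a target token is output once the
    total received source info is at least `k` larger than the total info
    of the previously output target tokens, or the source is exhausted.
    """
    reads: List[int] = []
    num_read = 0       # source tokens received so far
    src_total = 0.0    # total info of received source tokens
    tgt_total = 0.0    # total info of previously output target tokens

    for info in tgt_info:
        # WAIT: keep reading while the received source info is insufficient.
        while num_read < len(src_info) and src_total < tgt_total + k:
            src_total += src_info[num_read]
            num_read += 1
        # OUTPUT: emit the next target token and account for its info.
        reads.append(num_read)
        tgt_total += info

    return reads


if __name__ == "__main__":
    # Uneven info: low-info source tokens make the model wait for more of them.
    print(wait_info_schedule([0.5, 0.5, 2.0, 1.0], [1.0, 1.5, 0.5], k=1.0))
```

Under this assumed rule, giving every token an info of 1.0 makes the schedule read `k` source tokens before the first output and one more before each later output, which is the fixed-lag, token-level behaviour the abstract contrasts with.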