Previous studies have shown that neural machine translation (NMT) models can benefit from modeling translated (Past) and untranslated (Future) source contents as recurrent states (Zheng et al., 2018). However, the recurrent process is less interpretable. In this paper, we propose to model the Past and the Future with a Capsule Network (Hinton et al., 2011), which provides an explicit separation of source words into Past and Future groups through a parts-to-wholes assignment. The assignment is learned with a novel variant of the routing-by-agreement mechanism (Sabour et al., 2017), namely Guided Dynamic Routing, in which what to translate at the current decoding step guides the routing process to assign each source word to its associated group, represented by a capsule, and to refine the representation of that capsule dynamically and iteratively. Experiments on translation tasks over three language pairs show that our model achieves substantial improvements over both RNMT and Transformer baselines. Extensive analysis further verifies that our method does recognize translated and untranslated content as expected, and produces better and more adequate translations.
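To make the mechanism concrete, the following is a minimal, hypothetical sketch of guided dynamic routing in NumPy: the current decoder state ("query") biases the initial routing logits, each source word distributes its "vote" over a small set of capsules (e.g. Past vs. Future), and agreement between votes and capsule states iteratively refines the assignment. All shapes, weight initializations, and names here are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def guided_dynamic_routing(src, query, n_caps=2, dim=6, iters=3, seed=0):
    """Hypothetical sketch of Guided Dynamic Routing (not the paper's exact design).

    src:   (n_words, d_src) source word representations
    query: (d_q,) current decoder state, i.e. "what to translate now"
    Returns a soft word-to-capsule assignment and the capsule states.
    """
    rng = np.random.default_rng(seed)
    n_words, d_src = src.shape
    # per-capsule transformation of each source word produces its "vote"
    W = rng.standard_normal((n_caps, d_src, dim)) * 0.1
    votes = np.einsum('nd,kdo->nko', src, W)          # (n_words, n_caps, dim)
    # the decoder query biases the initial routing logits (the "guidance")
    Wq = rng.standard_normal((query.shape[0], n_caps)) * 0.1
    logits = np.tile(query @ Wq, (n_words, 1))        # (n_words, n_caps)
    for _ in range(iters):
        # softmax over capsules: each word distributes its assignment
        c = np.exp(logits - logits.max(-1, keepdims=True))
        c = c / c.sum(-1, keepdims=True)
        # capsule states are assignment-weighted sums of the votes
        caps = np.einsum('nk,nko->ko', c, votes)      # (n_caps, dim)
        # vote-capsule agreement refines the routing logits
        logits = logits + np.einsum('nko,ko->nk', votes, caps)
    return c, caps
```

With `n_caps=2`, the resulting assignment matrix `c` can be read as a soft split of the source words into a Past group and a Future group at the current decoding step; re-running the routing at each step lets the split evolve with the decoder state.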