Paraphrase generation has benefited extensively from recent progress in the design of training objectives and model architectures. However, previous explorations have largely focused on supervised methods, which require a large amount of labeled data that is costly to collect. To address this drawback, we adopt a transfer learning approach and propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting. Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking (DB). To enforce a surface form dissimilar from the input, whenever the language model emits a token contained in the source sequence, DB prevents the model from outputting the subsequent source token at the next generation step. We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair (QQP) and the ParaNMT datasets and is robust to domain shift between these two distinct distributions. We also demonstrate that our model transfers to paraphrasing in other languages without any additional finetuning.
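To make the Dynamic Blocking idea concrete, the following is a minimal, hypothetical sketch of the blocking step as a per-step logits mask, assuming a standard token-by-token decoding loop. The function name, tensor shapes, and use of PyTorch are illustrative assumptions, not the authors' implementation.

```python
import torch

# Hypothetical sketch: mask the logits at one decoding step so that, if the
# previously generated token also appears in the source sequence, the source
# token that immediately follows it cannot be emitted next.
def dynamic_blocking_mask(source_ids, generated_ids, logits, block_value=-1e9):
    """source_ids, generated_ids: lists of token ids; logits: 1-D tensor over the vocabulary."""
    if not generated_ids:
        return logits
    last_token = generated_ids[-1]
    blocked = set()
    # Collect every source token that directly follows an occurrence of the
    # last generated token in the source sequence.
    for i in range(len(source_ids) - 1):
        if source_ids[i] == last_token:
            blocked.add(source_ids[i + 1])
    masked = logits.clone()
    for token_id in blocked:
        masked[token_id] = block_value  # effectively removes it from sampling
    return masked
```

In a decoding loop, this mask would be applied to the model's logits before softmax at every step, discouraging verbatim copying of source n-grams while leaving all other continuations available.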