Autoregressive generative models naturally generate variable-length sequences, while non-autoregressive models struggle, often imposing rigid, token-wise structures. We propose Edit Flows, a non-autoregressive model that overcomes these limitations by defining a discrete flow over sequences through edit operations: insertions, deletions, and substitutions. By modeling these operations within a Continuous-time Markov Chain over the sequence space, Edit Flows enables flexible, position-relative generation that aligns more closely with the structure of sequence data. Our training method leverages an expanded state space with auxiliary variables, making the learning process efficient and tractable. Empirical results show that Edit Flows outperforms both autoregressive and mask models on image captioning and significantly outperforms the mask construction in text and code generation.
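For intuition, the sketch below shows the three edit operations as plain list manipulations on a token sequence; in Edit Flows, these operations are the jump events of the CTMC over sequences. This is a minimal, hypothetical illustration, not the paper's implementation, and the helper names are ours.

```python
# Minimal sketch (illustrative only): the three edit operations on a token sequence.
from typing import List


def substitute(seq: List[str], pos: int, token: str) -> List[str]:
    # Replace the token at `pos` with `token`; length is unchanged.
    return seq[:pos] + [token] + seq[pos + 1:]


def insert(seq: List[str], pos: int, token: str) -> List[str]:
    # Insert `token` before position `pos` (pos == len(seq) appends); length grows by one.
    return seq[:pos] + [token] + seq[pos:]


def delete(seq: List[str], pos: int) -> List[str]:
    # Remove the token at `pos`; length shrinks by one.
    return seq[:pos] + seq[pos + 1:]


if __name__ == "__main__":
    x = ["a", "cat", "sat"]
    x = insert(x, 1, "black")      # ['a', 'black', 'cat', 'sat']
    x = substitute(x, 3, "slept")  # ['a', 'black', 'cat', 'slept']
    x = delete(x, 1)               # ['a', 'cat', 'slept']
    print(x)
```

Because insertions and deletions change the sequence length, generation is position-relative rather than tied to a fixed, token-wise grid, which is the flexibility the abstract contrasts against mask-based models.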