We address the problem of predicting edit completions based on a learned model that was trained on past edits. Given a code snippet that is partially edited, our goal is to predict a completion of the edit for the rest of the snippet. We refer to this task as the EditCompletion task and present a novel approach for tackling it. The main idea is to directly represent structural edits. This allows us to model the likelihood of the edit itself, rather than learning the likelihood of the edited code. We represent an edit operation as a path in the program's Abstract Syntax Tree (AST), leading from the source of the edit to its target. Using this representation, we present a powerful and lightweight neural model for the EditCompletion task. We conduct a thorough evaluation, comparing our approach to a variety of representation and modeling approaches, driven by multiple strong models such as LSTMs, Transformers, and neural CRFs. Our experiments show that our model achieves a 28% relative gain over state-of-the-art sequential models and 2x higher accuracy than syntactic models that learn to generate the edited code, as opposed to modeling the edits directly. Our code, dataset, and trained models are publicly available at https://github.com/tech-srl/c3po/.
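As a rough intuition for what "an edit represented as an AST path" means, the sketch below extracts the path of node types between a source node and a target node in an AST, climbing to their lowest common ancestor and descending to the target. It uses Python's standard `ast` module rather than the paper's setting, and all function names here are illustrative, not part of the released code.

```python
import ast

def ancestor_chains(tree):
    """Map each AST node to its chain of ancestors (root first, node last)."""
    chains = {}
    def walk(node, chain):
        chains[node] = chain + [node]
        for child in ast.iter_child_nodes(node):
            walk(child, chains[node])
    walk(tree, [])
    return chains

def ast_edit_path(chains, src, tgt):
    """Path of node-type names from src up to the lowest common
    ancestor of src and tgt, then down to tgt."""
    a, b = chains[src], chains[tgt]
    i = 0
    while i < min(len(a), len(b)) and a[i] is b[i]:
        i += 1                                  # last shared index + 1
    lca = a[i - 1]
    up = [type(n).__name__ for n in reversed(a[i:])]   # src's side, bottom-up
    down = [type(n).__name__ for n in b[i:]]           # tgt's side, top-down
    return up + [type(lca).__name__] + down

code = "if x:\n    y = f(x)\n"
tree = ast.parse(code)
chains = ancestor_chains(tree)
src = tree.body[0].test            # the Name node `x` in the condition
tgt = tree.body[0].body[0].value   # the Call node `f(x)` in the body
print(ast_edit_path(chains, src, tgt))  # ['Name', 'If', 'Assign', 'Call']
```

A path like `Name ↑ If ↓ Assign ↓ Call` describes *where* an edit moves or copies code relative to the tree structure, independent of the surrounding identifiers, which is what lets a model score the edit itself rather than the resulting code.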