SynGuar: Guaranteeing Generalization in Programming by Example

Programming by Example (PBE) is a program synthesis paradigm in which the synthesizer creates a program that matches a set of given examples. In many applications of such synthesis (e.g., program repair or reverse engineering), the goal is to reconstruct a program that is close to a specific target program, not merely to produce some program that satisfies the seen examples. In such settings, we want the synthesized program to generalize well, i.e., to make as few errors as possible on the unobserved examples that capture the target function's behavior. In this paper, we propose the first framework (called SynGuar) for PBE synthesizers that is guaranteed to achieve low generalization error with high probability. Our main contribution is a procedure that dynamically calculates how many additional examples suffice to theoretically guarantee generalization. We show how our techniques can be used in two well-known synthesis approaches, PROSE and STUN (synthesis through unification), on common string-manipulation program benchmarks. We find that a few hundred examples often suffice to provably bound generalization error below $5\%$ with high ($\geq 98\%$) probability on these benchmarks. We also confirm this empirically: SynGuar significantly improves the accuracy of existing synthesizers in generating the right target programs. With fewer, arbitrarily chosen examples, the same baseline synthesizers (without SynGuar) overfit and lose accuracy.
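
To give a feel for why "a few hundred examples" is a plausible order of magnitude, the sketch below evaluates the classical PAC bound for a learner that returns any hypothesis consistent with the examples, drawn from a finite hypothesis class: $m \geq \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right)$. This is a textbook bound used here for illustration only; the abstract does not state the exact formula SynGuar uses. The function name `examples_needed` and the hypothesis-space sizes are hypothetical.

```python
import math


def examples_needed(hypothesis_count: int, epsilon: float, delta: float) -> int:
    """Number of i.i.d. examples sufficient for a consistent learner over a
    finite hypothesis class to have error <= epsilon with probability
    >= 1 - delta (classical PAC bound; an illustrative assumption, not
    necessarily the bound SynGuar itself computes)."""
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon)


if __name__ == "__main__":
    # Targets from the abstract: error below 5% with probability >= 98%.
    epsilon, delta = 0.05, 0.02
    # Hypothetical sizes of the space of candidate programs still consistent
    # with the examples seen so far; fewer candidates require fewer examples.
    for size in (10**12, 10**6, 10**3, 1):
        print(f"|H| = {size:>13}: {examples_needed(size, epsilon, delta)} examples suffice")
```

Under these assumptions, the required number of examples grows only logarithmically with the size of the candidate program space, so it falls as observed examples rule out candidates. That is one way to read the "dynamic" calculation described above.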
