基于切片的概率程序验证中的错误定位、证明与提示（扩展版） (Error Localization, Certificates, and Hints for Probabilistic Program Verification via Slicing (Extended Version))

This paper focuses on effective user diagnostics generated during the deductive verification of probabilistic programs. Our key principle is based on providing slices for (1) error reporting, (2) proof simplification, and (3) preserving successful verification results. By formally defining these different notions on HeyVL, an existing quantitative intermediate verification language (IVL), our concepts (and implementation) can be used to obtain diagnostics for a range of probabilistic programming languages. Slicing for error reporting is a novel notion of error localization for quantitative assertions. We demonstrate slicing-based diagnostics on a variety of proof rules such as quantitative versions of the specification statement and invariant-based loop rules, and formally prove the correctness of specialized error messages and verification hints. We implemented our user diagnostics into the deductive verifier Caesar. Our novel implementation -- called \emph{Brutus} -- can search for slices which do or do not verify, corresponding to each of the three diagnostic notions. For error reporting (1), it exploits a binary search-based algorithm that minimizes error-witnessing slices. To solve for slices that verify (2 and 3), we empirically compare different algorithms based on unsatisfiable cores, minimal unsatisfiable subset enumeration, and a direct SMT encoding of the slicing problem. Our empirical evaluation of Brutus on existing and new benchmarks shows that we can find slices that are both small and informative.

翻译：本文聚焦于概率程序演绎验证过程中生成的有效用户诊断信息。我们的核心原理基于为（1）错误报告、（2）证明简化及（3）保持成功验证结果提供程序切片。通过在现有定量中间验证语言HeyVL上形式化定义这些不同概念，我们的理论框架（及实现）可用于为多种概率编程语言生成诊断信息。面向错误报告的切片是针对定量断言错误定位的全新概念。我们在多种验证规则（如定量化规范语句与基于不变式的循环规则）上展示了基于切片的诊断方法，并形式化证明了专用错误信息与验证提示的正确性。我们将用户诊断功能集成至演绎验证器Caesar中。这一名为Brutus的创新实现能够搜索验证通过或失败的切片，分别对应三种诊断概念。对于错误报告（1），系统采用基于二分搜索的算法以最小化错误见证切片。为求解验证通过的切片（2与3），我们通过实验比较了基于不可满足核、最小不可满足子集枚举及切片问题直接SMT编码的不同算法。在现有及新基准测试中对Brutus的实证评估表明，我们能够找到既精简又信息丰富的程序切片。