Traditionally, mutation testing generates an abundance of small deviations of a program, called mutants. For industrial systems of the scale and size of Facebook's, doing this is infeasible. We should not create mutants that the test suite would likely fail on or that give no actionable signal to developers. To tackle this problem, in this paper, we semi-automatically learn error-inducing patterns from a corpus of common Java coding errors and from changes that caused operational anomalies at Facebook specifically. We combine the mutations with instrumentation that measures exactly which tests visited the mutated piece of code. Results on more than 15,000 generated mutants show that more than half of them survive Facebook's rigorous test suite of unit, integration, and system tests. Moreover, in a case study with 26 developers, all but two found information about automatically detected test holes interesting in principle. Almost half of the 26 would actually act on the mutant presented to them by adapting an existing test or creating a new one. The others did not, for a variety of reasons often outside the scope of mutation testing. It remains a practical challenge how we can include such external information to increase the true actionability rate on mutants.
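To make the idea of a surviving mutant combined with visit instrumentation concrete, the following is a minimal Java sketch. It is not the tooling described in the abstract: the class `MutantCoverageSketch`, the method `firstOrDefault`, the two toy tests, and the mutation itself (replacing an emptiness check with `false`) are all hypothetical and chosen only for illustration. The probe records which tests reached the mutated line, and the tiny harness reports whether every test still passes with the mutant enabled.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.BooleanSupplier;

public class MutantCoverageSketch {
    // Name of the test currently running; set by the toy harness below.
    static String currentTest = null;
    // Tests that executed the mutated line at least once.
    static final Set<String> visitedBy = new LinkedHashSet<>();
    // Toggles the hypothetical mutation on or off.
    static boolean mutantEnabled = false;

    // Code under test: first element, or a fallback for null or empty input.
    static int firstOrDefault(int[] values, int fallback) {
        if (values == null) {
            return fallback;
        }
        // Instrumentation probe: remember which test reached the mutated code.
        if (currentTest != null) {
            visitedBy.add(currentTest);
        }
        // Original condition: values.length == 0. Mutant: condition forced to false.
        boolean empty = mutantEnabled ? false : values.length == 0;
        return empty ? fallback : values[0];
    }

    // A deliberately incomplete test suite: no test passes an empty array.
    static Map<String, BooleanSupplier> tests() {
        Map<String, BooleanSupplier> t = new LinkedHashMap<>();
        t.put("testNull", () -> firstOrDefault(null, -1) == -1);
        t.put("testNonEmpty", () -> firstOrDefault(new int[] {7, 8}, -1) == 7);
        return t;
    }

    // Runs every test against the mutant; returns true if the mutant survives.
    static boolean mutantSurvives() {
        mutantEnabled = true;
        visitedBy.clear();
        boolean allPassed = true;
        for (Map.Entry<String, BooleanSupplier> e : tests().entrySet()) {
            currentTest = e.getKey();
            try {
                allPassed &= e.getValue().getAsBoolean();
            } catch (RuntimeException ex) {
                allPassed = false; // a crashing test counts as a failure
            }
        }
        currentTest = null;
        mutantEnabled = false;
        return allPassed;
    }

    public static void main(String[] args) {
        boolean survived = mutantSurvives();
        System.out.println("survived=" + survived + " visitedBy=" + visitedBy);
    }
}
```

In this sketch the mutant survives even though `testNonEmpty` visits the mutated line, which points at a test hole: no test exercises the empty-array case. A mutant that no test visits at all would call for a different kind of follow-up, which is one reason recording visits alongside the pass/fail outcome can be useful.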