Raw lines of code (LOC) is a metric that does not, at first glance, seem especially useful for automated test generation. It is highly language-dependent, and it carries little semantic meaning even within a single language: one programmer can produce the same effect with far fewer lines than another. However, relative LOC, compared across components of the same project, turns out to be a highly useful metric for automated testing. In this paper, we use a heuristic based on LOC counts for tested functions to dramatically improve the effectiveness of automated test generation. This approach is particularly valuable in languages where collecting code coverage data to guide testing has a very high overhead. We apply the heuristic to property-based Python testing using the TSTL (Template Scripting Testing Language) tool. In our experiments, the simple LOC heuristic can improve branch and statement coverage by large margins (often more than 20%, and up to 40% or more) and improve fault detection by an even larger margin (usually more than 75%, and up to 400% or more). The LOC heuristic is also easy to combine with other approaches, and it is comparable to, and possibly more effective than, two well-established approaches for guiding random testing.
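The abstract does not spell out how the LOC counts enter test generation, so the following is only a minimal sketch of one plausible reading: a random tester picks which function to call next with probability proportional to that function's LOC, relative to the other functions in the same project. The names `loc_weights`, `choose_function`, and the sample LOC table are hypothetical illustrations; this is not TSTL's API and not necessarily the heuristic used in the paper.

```python
import random


def loc_weights(loc_by_function):
    """Turn raw LOC counts into relative selection probabilities.

    Assumption: only the ratio between functions of the same project
    matters, so the counts are normalized to sum to 1.
    """
    total = sum(loc_by_function.values())
    if total <= 0:
        raise ValueError("LOC counts must sum to a positive number")
    return {name: loc / total for name, loc in loc_by_function.items()}


def choose_function(loc_by_function, rng):
    """Pick the next function to test, biased toward larger functions."""
    names = list(loc_by_function)
    weights = [loc_by_function[name] for name in names]
    # random.Random.choices performs weighted sampling with replacement.
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Hypothetical LOC counts for three functions under test.
    sample_loc = {"parse": 120, "insert": 40, "is_empty": 3}
    rng = random.Random(0)  # fixed seed so runs are repeatable
    picks = [choose_function(sample_loc, rng) for _ in range(1000)]
    for name in sample_loc:
        print(name, picks.count(name))
```

Under this sketch, a three-line accessor such as `is_empty` is still called occasionally, while most of the testing budget goes to the functions that contain more code. No coverage instrumentation is needed, which matches the abstract's point about languages where collecting coverage is expensive.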