Most bug assignment approaches rely on text classification and information retrieval techniques, using the textual contents of bug reports to build recommendation models. The textual contents of bug reports are usually a high-dimensional and noisy source of information, so these approaches suffer from low accuracy and high computational cost. In this paper, we investigate whether the categorical fields of a bug report, such as the component to which the bug belongs, are an appropriate representation of the report in place of its textual description. We build a classification model that uses these categorical features to represent the bug report. The experimental evaluation is conducted on three projects, namely NetBeans, Freedesktop, and Firefox. We compare this approach with two machine-learning-based bug assignment approaches. The evaluation shows that using the textual contents of bug reports is important. In addition, it shows that the categorical features can improve classification accuracy.
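To make the idea of representing a bug report by its categorical fields concrete, the following is a minimal sketch, not the model evaluated in the paper. It assumes a naive Bayes classifier with Laplace smoothing, which the abstract does not specify. The field names (`product`, `component`, `severity`), the developer names, and the toy training data are all hypothetical.

```python
from collections import Counter, defaultdict
import math


class CategoricalNaiveBayes:
    """Naive Bayes over categorical bug-report fields (illustrative assumption)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength

    def fit(self, reports, developers):
        self.class_counts = Counter(developers)
        # value_counts[field][developer][value] -> number of training reports
        self.value_counts = defaultdict(lambda: defaultdict(Counter))
        self.field_values = defaultdict(set)
        for report, dev in zip(reports, developers):
            for field, value in report.items():
                self.value_counts[field][dev][value] += 1
                self.field_values[field].add(value)
        return self

    def predict(self, report):
        total = sum(self.class_counts.values())
        best_dev, best_score = None, -math.inf
        for dev, n_dev in self.class_counts.items():
            score = math.log(n_dev / total)  # log prior of the developer
            for field, value in report.items():
                count = self.value_counts[field][dev][value]
                # One extra slot reserves probability mass for unseen values.
                n_values = len(self.field_values[field]) + 1
                score += math.log(
                    (count + self.alpha) / (n_dev + self.alpha * n_values)
                )
            if score > best_score:
                best_dev, best_score = dev, score
        return best_dev


# Hypothetical training data: each report is a dict of categorical fields.
train_reports = [
    {"product": "editor", "component": "ui", "severity": "normal"},
    {"product": "editor", "component": "ui", "severity": "major"},
    {"product": "editor", "component": "parser", "severity": "normal"},
    {"product": "server", "component": "network", "severity": "critical"},
    {"product": "server", "component": "network", "severity": "normal"},
]
train_devs = ["alice", "alice", "bob", "carol", "carol"]

model = CategoricalNaiveBayes().fit(train_reports, train_devs)
new_report = {"product": "server", "component": "network", "severity": "major"}
prediction = model.predict(new_report)
print(prediction)
```

Because each report reduces to a handful of discrete values, the model only needs per-field counts, which is far cheaper than working with a high-dimensional term vector built from the report text.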