Traces of user interactions with a software system, captured in production, are commonly used as an input source for user experience testing. In this paper, we present an alternative use, introducing a novel approach to modeling user interaction traces enriched with another type of data gathered in production: software fault reports consisting of software exceptions and stack traces. The model described in this paper aims to improve developers' comprehension of the circumstances surrounding a specific software exception and can highlight specific user behaviors that lead to a high frequency of software faults. Combining interaction traces and software crash reports into an interpretable and useful model is challenging because of the complexity and variance of the combined data source. Therefore, we propose a probabilistic unsupervised learning approach, adapting the Nested Hierarchical Dirichlet Process, a Bayesian non-parametric topic model commonly applied to natural language data. This model infers a tree of topics, each of which describes a set of commonly co-occurring commands and exceptions. The topic tree can be interpreted hierarchically to aid in categorizing the numerous types of exceptions and interactions. We apply the proposed approach to large-scale datasets collected from the ABB RobotStudio software application, and evaluate it both numerically and with a small survey of the RobotStudio developers.
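To make the input to such a topic model concrete, the following minimal sketch shows one way interaction traces and exceptions could be merged into bag-of-words "documents". The abstract does not specify the preprocessing, so everything here is an assumption made for illustration: the one-document-per-session choice, the `cmd:` and `exc:` prefixes, the event names, and the helper functions `to_bag_of_words` and `build_vocabulary`. Fitting the Nested Hierarchical Dirichlet Process itself is not shown.

```python
from collections import Counter

# Hypothetical session logs: each session is an ordered list of events.
# The "cmd:" / "exc:" prefixes and all event names are illustrative
# assumptions, not taken from the RobotStudio datasets.
sessions = [
    ["cmd:OpenStation", "cmd:AddRobot", "exc:NullReferenceException"],
    ["cmd:OpenStation", "cmd:AddRobot", "cmd:Save"],
    ["cmd:ImportGeometry", "exc:IOException", "cmd:ImportGeometry"],
]


def to_bag_of_words(session):
    """Count each command or exception token, discarding event order."""
    return Counter(session)


def build_vocabulary(all_sessions):
    """Map every distinct token to an integer id, in sorted order."""
    tokens = sorted({tok for session in all_sessions for tok in session})
    return {tok: i for i, tok in enumerate(tokens)}


vocab = build_vocabulary(sessions)
documents = [to_bag_of_words(s) for s in sessions]

for doc in documents:
    # Print each document as sorted (token id, count) pairs.
    print(sorted((vocab[tok], n) for tok, n in doc.items()))
```

Placing commands and exceptions in one shared vocabulary is what would let a single topic contain both kinds of token, which matches the abstract's description of topics as sets of commonly co-occurring commands and exceptions.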