A growing demand is witnessed in both industry and academia for employing Deep Learning (DL) in various domains to solve real-world problems. Deep Reinforcement Learning (DRL) is the application of DL in the domain of Reinforcement Learning (RL). Like any software system, DRL applications can fail because of faults in their programs. In this paper, we present the first attempt to categorize faults occurring in DRL programs. We manually analyzed 761 artifacts of DRL programs (from Stack Overflow posts and GitHub issues) developed using well-known DRL frameworks (OpenAI Gym, Dopamine, Keras-rl, Tensorforce) and identified faults reported by developers/users. We labeled and taxonomized the identified faults through several rounds of discussion. The resulting taxonomy is validated using an online survey with 19 developers/researchers. To allow for the automatic detection of faults in DRL programs, we have defined a meta-model of DRL programs and developed DRLinter, a model-based fault detection approach that leverages static analysis and graph transformations. The execution flow of DRLinter consists of parsing a DRL program to generate a model conforming to our meta-model and applying detection rules on the model to identify fault occurrences. The effectiveness of DRLinter is evaluated using 15 synthetic DRL programs in which we injected faults observed in the analyzed artifacts of the taxonomy. The results show that DRLinter successfully detects faults in all synthetic faulty programs.