Probabilistic graphical models are a fundamental tool in probabilistic modeling, machine learning and artificial intelligence. They allow us to integrate expert knowledge, physical modeling, heterogeneous and correlated data, and quantities of interest in a natural way. For exactly this reason, multiple sources of model uncertainty are inherent in the modular structure of a graphical model. In this paper we develop information-theoretic, robust uncertainty quantification methods and non-parametric stress tests for directed graphical models. These assess how multi-sourced model uncertainties affect the quantities of interest and how they propagate through the graph. The methods allow us to rank the different sources of uncertainty and to correct the graphical model by targeting the components with the greatest impact on the quantities of interest. Thus, from a machine learning perspective, we provide a mathematically rigorous approach to correctability: it guarantees a systematic selection of the components of a graphical model to improve, while controlling new errors that the correction may create in other parts of the model. We demonstrate our methods on two physico-chemical examples, namely quantum scale-informed chemical kinetics and materials screening to improve the efficiency of fuel cells.