Many areas of science make extensive use of computer simulators that implicitly encode likelihood functions of complex systems. Classical statistical methods are poorly suited for these so-called likelihood-free inference (LFI) settings, especially outside the asymptotic and low-dimensional regimes. Although new machine learning methods, such as normalizing flows, have revolutionized the sample efficiency and capacity of LFI methods, it remains an open question whether they produce reliable measures of uncertainty. This paper presents a statistical framework for LFI that unifies classical statistics with modern machine learning to (1) efficiently construct frequentist confidence sets and hypothesis tests with finite-sample guarantees of nominal coverage (type I error control) and power, and (2) provide practical diagnostics for assessing empirical coverage over the entire parameter space. We refer to our framework as likelihood-free frequentist inference (LF2I). Any method that estimates a test statistic, such as the likelihood ratio, can be plugged into our framework to create valid confidence sets and compute diagnostics without costly Monte Carlo samples at fixed parameter settings. In this work, we specifically study the power of two test statistics, ACORE and BFF, which respectively maximize and integrate an odds function over the parameter space. Our study offers multifaceted perspectives on the challenges in LF2I.
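To make the contrast between maximizing and integrating an odds function concrete, the following is a minimal sketch and not the paper's implementation. It assumes a toy Gaussian simulator N(theta, 1), a fixed reference distribution N(0, 3^2), a point null hypothesis, and a uniform prior over a parameter grid. It also uses the exact odds in closed form, whereas in practice the odds would be estimated by a classifier. All function names here are hypothetical.

```python
import numpy as np

def log_odds(x, theta):
    # Exact log-odds of N(theta, 1) against a fixed reference N(0, 3^2).
    # Assumption: stands in for an odds function learned by a classifier.
    log_f = -0.5 * (x - theta) ** 2 - 0.5 * np.log(2 * np.pi)
    log_g = -0.5 * (x / 3.0) ** 2 - np.log(3.0) - 0.5 * np.log(2 * np.pi)
    return log_f - log_g

def summed_log_odds(data, thetas):
    # For each theta, sum the log-odds over all observations.
    # Broadcasting gives shape (len(thetas), len(data)) before the sum.
    return log_odds(data[None, :], thetas[:, None]).sum(axis=1)

def maximized_statistic(data, theta0, grid):
    # Log-odds at the null value minus the maximum over the grid.
    num = summed_log_odds(data, np.array([theta0]))[0]
    return num - summed_log_odds(data, grid).max()

def integrated_statistic(data, theta0, grid):
    # Log-odds at the null value minus the log of the grid average,
    # which approximates an integral under a uniform prior on the grid.
    num = summed_log_odds(data, np.array([theta0]))[0]
    s = summed_log_odds(data, grid)
    log_den = s.max() + np.log(np.mean(np.exp(s - s.max())))  # stable log-mean-exp
    return num - log_den

rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=1.0, size=20)  # toy observations, true theta = 1
grid = np.linspace(-5.0, 5.0, 201)              # hypothetical parameter grid

for theta0 in (1.0, 4.0):
    print(theta0,
          maximized_statistic(data, theta0, grid),
          integrated_statistic(data, theta0, grid))
```

In this sketch both statistics are larger for a null value near the true parameter than for one far from it. The integrated version is never smaller than the maximized one, because an average over the grid cannot exceed the maximum over the same grid. Turning either statistic into a test with controlled type I error still requires calibrated critical values, which this sketch does not cover.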