Uncertainty quantification is a key part of astronomy and physics; researchers attempt to model both the statistical and systematic uncertainties in their data as well as possible, often within a Bayesian framework. Decisions may then be made on the basis of the resulting uncertainty quantification -- for example, whether to believe a certain theory, or whether to take certain actions. However, it is well known that most statistical claims should be interpreted contextually: even when certain models are excluded at a very high level of confidence, researchers are typically aware that there may be systematics that were not accounted for, and will therefore usually require confirmation from multiple independent sources before any novel result is truly accepted. In this paper we compare two methods from the astronomical literature that seek to quantify these `unknown unknowns' -- in particular, to produce realistically thick tails in the posteriors of parameter estimation problems, accounting for the possible existence of very large unknown effects. We test these methods on a series of case studies, and discuss how robust they would be in the presence of malicious interference with the scientific data.