Causality is a fundamental part of the scientific endeavour to understand the world. Unfortunately, causality is still taboo in much of psychology and social science. Motivated by a growing number of recommendations to adopt causal approaches to research, we reformulate the typical approach to research in psychology to harmonize inevitably causal theories with the rest of the research pipeline. We present a new process which begins with the incorporation of techniques from the confluence of causal discovery and machine learning for the development, validation, and transparent formal specification of theories. We then present methods for reducing the complexity of the fully specified theoretical model into the fundamental submodel relevant to a given target hypothesis. From here, we establish whether or not the quantity of interest is estimable from the data, and if so, propose the use of semi-parametric machine learning methods for the estimation of causal effects. The overall goal is the presentation of a new research pipeline which can (a) facilitate scientific inquiry compatible with the desire to test causal theories, (b) encourage transparent representation of our theories as unambiguous mathematical objects, (c) tie our statistical models to specific attributes of the theory, thus reducing the under-specification problems that frequently result from the theory-to-model gap, and (d) yield results and estimates which are causally meaningful and reproducible. The process is demonstrated through didactic examples with real-world data, and we conclude with a summary and discussion of limitations.