Explaining Practical Differences Between Treatment Effect Estimators with High Dimensional Asymptotics

We revisit the classical causal inference problem of estimating the average treatment effect in the presence of fully observed confounding variables using two-stage semiparametric methods. Existing theoretical studies of methods such as G-computation, inverse propensity weighting (IPW), and two common doubly robust estimators, augmented IPW (AIPW) and targeted maximum likelihood estimation (TMLE), find that these estimators are either bias-dominated or have similar asymptotic statistical properties. However, when applied to real datasets, they often appear to have notably different variance. We compare these methods when a machine learning (ML) model is used to estimate the nuisance parameters of the semiparametric model, and highlight some of the important differences. When the outcome model estimates have little bias, which is common among some key ML models, G-computation and TMLE outperform the other estimators in both bias and variance. We show that the differences can be explained using high-dimensional statistical theory, where the number of confounders $d$ is of the same order as the sample size $n$. To make this theoretical problem tractable, we posit a generalized linear model for the effect of the confounders on the treatment assignment and outcomes. Despite its parametric assumptions, this setting is a useful surrogate for some machine learning methods used to adjust for confounding in two-stage semiparametric methods. In particular, the estimation of the first stage adds variance that does not vanish, forcing us to confront terms in the asymptotic expansion that are normally brushed aside as finite-sample defects. However, our model emphasizes differences in performance between these estimators beyond first-order asymptotics.
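The four estimators named above can be illustrated with a small simulation. The sketch below is not the paper's experimental setup: it assumes a linear outcome model and a logistic treatment model (both generalized linear models), fits both first-stage models in-sample without sample splitting, and uses a linear fluctuation step for TMLE. The function names (`simulate`, `fit_logistic`, `estimate_all`), the ratio $d/n = 0.1$, the small ridge term, and the propensity clipping level are all hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(z):
    # Clip the logits so np.exp cannot overflow.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def simulate(n, d, tau=1.0, seed=0):
    """Hypothetical data: linear outcome, logistic treatment, d confounders."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d)) / np.sqrt(d)   # scale so X @ coef has O(1) variance
    beta = rng.normal(size=d)                  # outcome coefficients (assumed)
    gamma = rng.normal(size=d)                 # propensity coefficients (assumed)
    A = rng.binomial(1, sigmoid(X @ gamma))    # treatment assignment
    Y = tau * A + X @ beta + rng.normal(size=n)
    return X, A, Y

def fit_logistic(X, A, ridge=1e-3, iters=25):
    """Newton iterations for logistic regression; the tiny ridge is only for stability."""
    d = X.shape[1]
    w = np.zeros(d)
    for _ in range(iters):
        p = sigmoid(X @ w)
        grad = X.T @ (p - A) + ridge * w
        hess = (X * (p * (1.0 - p))[:, None]).T @ X + ridge * np.eye(d)
        w = w - np.linalg.solve(hess, grad)
    return w

def estimate_all(X, A, Y, clip=0.01):
    n = X.shape[0]
    # First stage: propensity scores e(x) and outcome regressions mu_a(x).
    e = np.clip(sigmoid(X @ fit_logistic(X, A)), clip, 1.0 - clip)
    Z = np.column_stack([np.ones(n), A, X])
    coef = np.linalg.lstsq(Z, Y, rcond=None)[0]
    mu0 = coef[0] + X @ coef[2:]
    mu1 = mu0 + coef[1]
    mu_obs = np.where(A == 1, mu1, mu0)
    # Second stage: plug the nuisance estimates into each estimator.
    h = A / e - (1 - A) / (1.0 - e)            # inverse-propensity weights with sign
    gcomp = np.mean(mu1 - mu0)
    ipw = np.mean(h * Y)
    aipw = gcomp + np.mean(h * (Y - mu_obs))
    # TMLE with a linear fluctuation: regress the residual on h, then update mu.
    eps = np.sum(h * (Y - mu_obs)) / np.sum(h ** 2)
    tmle = np.mean((mu1 + eps / e) - (mu0 - eps / (1.0 - e)))
    return {"gcomp": gcomp, "ipw": ipw, "aipw": aipw, "tmle": tmle}

# Repeat over independent draws to compare the spread of each estimator (true ATE is 1).
n, d = 1000, 100
runs = [estimate_all(*simulate(n, d, seed=s)) for s in range(20)]
for name in runs[0]:
    vals = np.array([r[name] for r in runs])
    print(f"{name:6s} mean={vals.mean():.3f} sd={vals.std(ddof=1):.3f}")
```

Changing `d` relative to `n` shows how the spread of each estimator responds as the first-stage fits become noisier. Each seed also redraws the coefficients `beta` and `gamma`, so the printed spread mixes sampling noise with variation across data-generating models.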
