Uplift modeling estimates the causal effect of an intervention as the difference between expected potential outcomes under treatment and control, whereas counterfactual identification aims to recover the joint distribution of these potential outcomes (e.g., "Would this customer still have churned had we given them a marketing offer?"). The joint counterfactual distribution carries more information than the uplift but is harder to estimate. The two approaches are nonetheless complementary: uplift models can be leveraged for counterfactual estimation. We propose a counterfactual estimator that fits a bivariate beta distribution to predicted uplift scores, yielding posterior distributions over counterfactual outcomes. Our approach requires no causal assumptions beyond those of uplift modeling. Simulations demonstrate the efficacy of the approach. Applied to customer churn in telecom, for example, it reveals insights unavailable to standard ML or uplift models alone.
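To make concrete why the joint distribution is harder to estimate than the uplift, the following minimal sketch may help. It is not the proposed estimator: it only applies the standard Fréchet-Hoeffding bounds to a binary churn outcome, showing that the two marginal churn probabilities fix the uplift exactly but leave the joint counterfactual probabilities determined only up to an interval. The function name `joint_bounds`, the cell labels, and the example probabilities are hypothetical and chosen purely for illustration.

```python
def joint_bounds(p_treated: float, p_control: float) -> dict:
    """Bound the joint law of two binary potential outcomes.

    p_treated = P(Y(1) = 1): churn probability if the offer is given.
    p_control = P(Y(0) = 1): churn probability if no offer is given.
    Both are assumed to come from an uplift model (illustrative inputs).
    """
    # Frechet-Hoeffding bounds on P(Y(0) = 1, Y(1) = 1).
    lower = max(0.0, p_treated + p_control - 1.0)
    upper = min(p_treated, p_control)

    def cells(both: float) -> dict:
        # The remaining three cells follow from the two marginals.
        return {
            "churn_either_way": both,
            "churn_only_if_treated": p_treated - both,
            "churn_only_if_untreated": p_control - both,
            "never_churn": 1.0 - p_treated - p_control + both,
        }

    return {
        "uplift": p_treated - p_control,  # point-identified by the marginals
        "lower": cells(lower),            # joint law at the lower bound
        "upper": cells(upper),            # joint law at the upper bound
    }


if __name__ == "__main__":
    result = joint_bounds(p_treated=0.2, p_control=0.3)
    for key, value in result.items():
        print(key, value)
```

With these illustrative inputs the uplift is a single number, while the share of customers who would churn regardless of the offer can lie anywhere between 0 and 0.2. Narrowing that interval to a distribution is the gap a counterfactual estimator has to close.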