Treatment Effect Bias from Sample Snooping: Blinding Outcomes is Neither Necessary nor Sufficient - Zhuanzhi Papers

April 20, 2021

Treatment Effect Bias from Sample Snooping: Blinding Outcomes is Neither Necessary nor Sufficient


From arXiv. Version notes: dramatic rewrite with new theoretical results and simulations. Most of the technical results from the first version are now contained in the supplement, with new results taking their place in the main text. The supplement is available as an ancillary file (see the link on the right-hand side of the arXiv page).

Popular guidance on observational data analysis states that outcomes should be blinded when determining matching criteria or propensity scores. Such blinding is informally said to maintain the "objectivity" of the analysis, and to prevent analysts from artificially amplifying the treatment effect by exploiting chance imbalances. Contrary to this notion, we show that outcome blinding is not a sufficient safeguard against fishing. Blinded and unblinded analysts can produce bias of the same order of magnitude in cases where the outcomes can be approximately predicted from baseline covariates. We illustrate this vulnerability with a combination of analytical results and simulations. Finally, to show that outcome blinding is not necessary to prevent bias, we outline an alternative sample partitioning procedure for estimating the average treatment effect on the controls, or the average treatment effect on the treated. This procedure uses all of the outcome data from all partitions in the final analysis step, but does not require the analysis to be fully prespecified.
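The claim that a blinded analyst can still fish when covariates predict the outcome can be illustrated with a toy simulation. The sketch below is not the paper's simulation design. It assumes a true treatment effect of zero, a single baseline covariate `x` that predicts the outcome, and an analyst who chooses among a handful of randomly drawn candidate control samples. The function name `simulate` and all of its parameters are hypothetical choices made for illustration.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def simulate(n_reps=1000, n_treated=50, n_pool=100, n_candidates=20, seed=0):
    """Average estimated effect under three sample-selection rules.

    Assumed toy model: the true treatment effect is zero, and
    y = x + noise, so the covariate x is a noisy proxy for the outcome.
    """
    rng = random.Random(seed)
    totals = {"prespecified": 0.0, "blinded": 0.0, "unblinded": 0.0}
    for _ in range(n_reps):
        # Covariates and outcomes; treatment has no effect on y.
        x_t = [rng.gauss(0, 1) for _ in range(n_treated)]
        x_c = [rng.gauss(0, 1) for _ in range(n_pool)]
        y_t = [x + rng.gauss(0, 0.5) for x in x_t]
        y_c = [x + rng.gauss(0, 0.5) for x in x_c]

        # Candidate control samples the analyst may choose between.
        subsets = [rng.sample(range(n_pool), n_treated)
                   for _ in range(n_candidates)]
        x_gaps = [mean(x_t) - mean([x_c[i] for i in s]) for s in subsets]
        y_gaps = [mean(y_t) - mean([y_c[i] for i in s]) for s in subsets]

        # Prespecified: the sample is fixed before looking at any data.
        totals["prespecified"] += y_gaps[0]
        # Blinded snooping: pick the sample with the largest covariate
        # gap, without ever looking at outcomes.
        best = max(range(n_candidates), key=lambda k: x_gaps[k])
        totals["blinded"] += y_gaps[best]
        # Unblinded snooping: pick the sample with the largest outcome gap.
        totals["unblinded"] += max(y_gaps)
    return {rule: total / n_reps for rule, total in totals.items()}


if __name__ == "__main__":
    for rule, bias in simulate().items():
        print(f"{rule:>12}: mean estimated effect = {bias:+.3f}")
```

Because the true effect is zero in this setup, each printed value is the bias of that selection rule. Under these assumptions the prespecified rule should sit near zero, while both snooping rules should be biased upward by comparable amounts, since selecting on the covariate gap indirectly selects on the outcome gap.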

