In this article we explore the data available through the Stanford Open Policing Project. The data consist of information on millions of traffic stops from close to 100 cities and highway patrols. Using a variety of metrics, we show that the data are not missing completely at random. We also develop ways of quantifying and visualizing missingness trends for different variables across the datasets. We then perform a sensitivity analysis that extends existing work on the outcome test and on sharp bounds for the average treatment effect. We demonstrate that bias calculations can shift fundamentally depending on the assumptions made about the observations for which the race variable has not been recorded. Finally, we suggest ways in which our missingness sensitivity analysis can be extended to many other contexts.