Multiple imputation (MI) is a popular and well-established method for handling missing data in multivariate data sets, but its practicality for massive and complex data sets has been questioned. One such data set is the Panel Study of Income Dynamics (PSID), a long-standing and extensive survey of household income and wealth in the U.S. Missing data in this survey are currently handled with traditional hot deck methods because they are simple to implement; however, the univariate hot deck results in large random wealth fluctuations. MI is effective but faces operational challenges. We use a sequential regression (chained-equation) approach, implemented in the software IVEware, to multiply impute cross-sectional wealth data in the 2013 PSID, and we compare analyses of the resulting imputed data with those from the current hot deck approach. We describe practical difficulties, such as non-normally distributed variables, skip patterns, categorical variables with many levels, and multicollinearity, together with our approaches to overcoming them. We evaluate imputation quality and validity with internal diagnostics and external benchmarking data. MI improves on the existing hot deck approach by helping preserve correlation structures, such as the associations between PSID wealth components and the relationships between household net worth and socio-demographic factors, and it facilitates general-purpose completed-data analyses. MI also incorporates highly predictive covariates into the imputation models, which increases efficiency. We recommend the practical implementation of MI and expect greater gains when the fraction of missing information is large.
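To make the sequential regression idea concrete, the following is a minimal sketch in Python with NumPy. It is not the IVEware procedure used in the study: it assumes every variable is continuous and fits a plain linear regression per variable, it ignores skip patterns and categorical variables, and it does not draw the regression coefficients from their posterior, so it understates imputation uncertainty. The function names `chained_equation_impute` and `pool_rubin` are hypothetical and chosen only for illustration.

```python
import numpy as np


def chained_equation_impute(X, n_iter=10, rng=None):
    """Return one completed copy of X (NaN marks missing) via sequential regressions.

    Simplified assumption: each column is continuous and modeled by ordinary
    least squares on all other columns; coefficient uncertainty is ignored.
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    miss = np.isnan(X)
    filled = X.copy()

    # Initialize missing cells with the observed mean of their column.
    col_means = np.nanmean(X, axis=0)
    filled[miss] = np.take(col_means, np.where(miss)[1])

    for _ in range(n_iter):
        for j in range(X.shape[1]):
            if not miss[:, j].any():
                continue
            obs = ~miss[:, j]
            # Regress column j on an intercept plus the current values of the others.
            Z = np.column_stack([np.ones(len(X)), np.delete(filled, j, axis=1)])
            beta, *_ = np.linalg.lstsq(Z[obs], filled[obs, j], rcond=None)
            resid = filled[obs, j] - Z[obs] @ beta
            dof = max(int(obs.sum()) - Z.shape[1], 1)
            sigma = np.sqrt(resid @ resid / dof)
            # Replace missing cells with predictions plus random residual noise.
            n_missing = int((~obs).sum())
            filled[~obs, j] = Z[~obs] @ beta + rng.normal(0.0, sigma, n_missing)
    return filled


def pool_rubin(estimates, variances):
    """Combine m completed-data estimates and their variances with Rubin's rules."""
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = len(q)
    q_bar = q.mean()                 # pooled point estimate
    within = u.mean()                # average within-imputation variance
    between = q.var(ddof=1)          # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    # Share of total variance due to missingness (no degrees-of-freedom correction).
    lam = (1.0 + 1.0 / m) * between / total
    return q_bar, total, lam
```

In use, one would call `chained_equation_impute` several times with different seeds to obtain multiple completed data sets, run the same analysis on each, and pass the resulting estimates and variances to `pool_rubin`. The returned `lam` is a rough stand-in for the fraction of missing information mentioned above.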