We propose a general test of conditional independence. The conditional predictive impact (CPI) is a provably consistent and unbiased estimator of one or several features' association with a given outcome, conditional on a (potentially empty) reduced feature set. Building on the knockoff framework of Candès et al. (2018), we develop a novel testing procedure that works in conjunction with any valid knockoff sampler, supervised learning algorithm, and loss function. The CPI can be efficiently computed for low- or high-dimensional data without any sparsity constraints. We demonstrate convergence criteria for the CPI and develop statistical inference procedures for evaluating its magnitude, significance, and precision. These tests aid in feature and model selection, extending traditional frequentist and Bayesian techniques to general supervised learning tasks. The CPI may also be applied in causal discovery to identify underlying graph structures for multivariate systems. We test our method using various algorithms, including linear regression, neural networks, random forests, and support vector machines. Empirical results show that the CPI compares favorably to alternative variable importance measures and other nonparametric tests of conditional independence on a diverse array of real and simulated datasets. Simulations confirm that our inference procedures successfully control Type I error and achieve nominal coverage probability with greater power and speed than the original knockoff filter. Our method has been implemented in an R package, cpi, which can be downloaded from https://github.com/dswatson/cpi.
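
To make the idea concrete, here is a minimal sketch of how a CPI-style test could be computed. It is an illustration under stated assumptions, not the implementation in the cpi package. It assumes the CPI of feature j is estimated as the mean increase in test-set loss when feature j is replaced by a knockoff copy, and that significance is assessed with a one-sided test on the per-sample loss differences (a large-sample normal approximation is used here in place of whatever test the package applies). The data are simulated with independent standard normal features, so an independent draw from N(0, 1) is a valid knockoff in this toy setting only; correlated features would need a proper knockoff sampler. The names `fit_ols`, `predict`, `simulate`, and `cpi_test` are hypothetical helpers written for this example.

```python
import random
from statistics import NormalDist, mean, stdev


def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[k] for r in rows) for k in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(beta, x):
    """Linear prediction for a single feature vector."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def cpi_test(beta, X_test, y_test, j, rng):
    """Estimate the CPI of feature j under squared-error loss.

    Returns (cpi, standard_error, one_sided_p_value).
    """
    deltas = []
    for x, yi in zip(X_test, y_test):
        x_ko = list(x)
        # ASSUMPTION: features are independent N(0, 1), so a fresh
        # N(0, 1) draw is a valid knockoff for feature j.
        x_ko[j] = rng.gauss(0.0, 1.0)
        loss_orig = (yi - predict(beta, x)) ** 2
        loss_ko = (yi - predict(beta, x_ko)) ** 2
        deltas.append(loss_ko - loss_orig)
    cpi = mean(deltas)
    se = stdev(deltas) / len(deltas) ** 0.5
    # One-sided test of H0: CPI <= 0 (normal approximation, large n).
    p_value = 1.0 - NormalDist().cdf(cpi / se)
    return cpi, se, p_value


def simulate(n, rng):
    """Three independent N(0, 1) features; only feature 0 affects y."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(n)]
    y = [2.0 * x[0] + rng.gauss(0.0, 1.0) for x in X]
    return X, y


rng = random.Random(0)
X_train, y_train = simulate(500, rng)
X_test, y_test = simulate(500, rng)
beta = fit_ols(X_train, y_train)

results = {j: cpi_test(beta, X_test, y_test, j, rng) for j in range(3)}
for j, (cpi, se, p) in results.items():
    print(f"feature {j}: CPI = {cpi:.3f}, SE = {se:.3f}, p = {p:.4f}")
```

In this sketch the model is fit once on training data and never refit; only the test-set inputs are perturbed. That is what lets the same recipe wrap any learner and loss function: `fit_ols` and the squared-error loss could be swapped for another algorithm and loss without changing `cpi_test`'s structure.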