We wish to test whether a real-valued variable $Z$ has explanatory power for a binary variable $Y$ beyond that of a multivariate variable $X$. Thus, we are interested in testing the hypothesis $\mathbb{P}(Y=1\, | \, X,Z)=\mathbb{P}(Y=1\, | \, X)$, based on $n$ i.i.d.\ copies of $(X,Y,Z)$. In order to avoid the curse of dimensionality, we follow the common approach of assuming that both $Y$ and $Z$ depend on $X$ only through a single index $X^\top\beta$. Splitting the sample according to the two values of $Y$, and splitting the $X$-space into parallel strips, we construct a two-sample empirical process of transformed $Z$-variables. Studying this two-sample empirical process is challenging: it does not converge weakly to a standard Brownian bridge, but after an appropriate normalization it does. We use this result to construct distribution-free tests.