Accurately estimating the proportion of signals hidden among a large number of noise variables is of interest in many scientific inquiries. In this paper, we consider realistic but theoretically challenging settings with arbitrary covariance dependence between variables. We define the mean absolute correlation (MAC) to measure the overall dependence strength and investigate the performance of a family of estimators over the full range of MAC. We make explicit the joint effect of MAC and signal sparsity on the performance of this family of estimators and find that the most powerful estimator under independence is no longer the most effective when the MAC dependence is strong enough. Motivated by this theoretical insight, we propose a new estimator that adapts better to arbitrary covariance dependence. The proposed method compares favorably to several existing methods in a wide range of finite-sample settings, with strong to weak covariance dependence and with real dependence structures from genetic association studies.
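
To illustrate the kind of quantity MAC summarizes, the following is a minimal sketch rather than the paper's definition. The abstract does not state the formula, so the sketch assumes MAC is the average of the absolute pairwise correlations over all off-diagonal entries of a correlation matrix. The function names `mean_absolute_correlation` and `equicorrelated` are hypothetical and introduced only for this example.

```python
import numpy as np


def mean_absolute_correlation(corr):
    """Average of |rho_ij| over all pairs i != j.

    Assumption: this off-diagonal average stands in for MAC; the exact
    normalization used in the paper may differ.
    """
    corr = np.asarray(corr, dtype=float)
    if corr.ndim != 2 or corr.shape[0] != corr.shape[1] or corr.shape[0] < 2:
        raise ValueError("corr must be a square matrix with at least 2 rows")
    # Boolean mask that selects every entry except the diagonal.
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    return float(np.abs(corr[off_diagonal]).mean())


def equicorrelated(p, rho):
    """Hypothetical helper: p x p correlation matrix with constant rho off the diagonal."""
    return np.full((p, p), rho) + (1.0 - rho) * np.eye(p)


if __name__ == "__main__":
    # Independence gives zero, and the value grows with dependence strength.
    for rho in (0.0, 0.2, 0.8):
        print(rho, mean_absolute_correlation(equicorrelated(50, rho)))
```

Under this assumed definition, independent variables have a MAC of zero and an equicorrelated structure with correlation rho has a MAC of |rho|, so the measure spans weak to strong dependence on a single scale.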