The Terminating-Knockoff Filter: Fast High-Dimensional Variable Selection with False Discovery Rate Control

February 3, 2022

Jasin Machkour, Michael Muma, Daniel P. Palomar

From arXiv; 29 pages, 13 figures, 2 tables

We propose the Terminating-Knockoff (T-Knock) filter, a fast variable selection method for high-dimensional data. The T-Knock filter controls a user-defined target false discovery rate (FDR) while maximizing the number of selected variables. This is achieved by fusing the solutions of multiple early-terminated random experiments. The experiments are conducted on a combination of the original predictors and multiple sets of randomly generated knockoff predictors. A finite-sample proof of the FDR control property, based on martingale theory, is provided. Numerical simulations show that the FDR is controlled at the target level while allowing for high power. We prove under mild conditions that the knockoffs can be sampled from any univariate probability distribution with finite expectation and variance. The computational complexity of the proposed method is derived, and numerical simulations demonstrate that the sequential computation time is multiple orders of magnitude lower than that of the strongest benchmark methods in sparse high-dimensional settings. The T-Knock filter outperforms state-of-the-art methods for FDR control on a simulated genome-wide association study (GWAS), while its computation time is more than two orders of magnitude lower than that of the strongest benchmark methods. An open-source R package containing the implementation of the T-Knock filter is available at https://github.com/jasinmachkour/tknock.
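The idea of fusing several early-terminated random experiments can be illustrated with a small Python sketch. This is not the authors' implementation (theirs is the R package linked above) and it carries no FDR guarantee. It assumes a greedy forward selection as the underlying selector, Gaussian dummy predictors standing in for the knockoffs, a simple voting rule for the fusion step, and hand-picked values for the number of experiments `K`, the stopping count `T`, and the threshold `vote`. All function and parameter names are hypothetical, and the calibration that yields FDR control in the paper is not reproduced.

```python
import numpy as np


def early_terminated_experiment(X, y, n_dummies, T, rng):
    """Run one random experiment and return the selected original columns.

    Assumption: greedy forward selection (pick the column most correlated
    with the current residual, then refit by least squares). The run stops
    as soon as T dummy columns have entered the active set.
    """
    n, p = X.shape
    dummies = rng.standard_normal((n, n_dummies))  # illustrative stand-in
    Z = np.hstack([X, dummies])
    Z = Z - Z.mean(axis=0)
    Z = Z / np.linalg.norm(Z, axis=0)  # unit-norm, centered columns
    yc = y - y.mean()

    residual = yc.copy()
    active = []
    dummies_in = 0
    max_steps = min(n - 1, Z.shape[1])
    while dummies_in < T and len(active) < max_steps:
        corr = np.abs(Z.T @ residual)
        if active:
            corr[active] = -np.inf  # never pick a column twice
        j = int(np.argmax(corr))
        active.append(j)
        if j >= p:  # indices >= p are dummy columns
            dummies_in += 1
        coef, *_ = np.linalg.lstsq(Z[:, active], yc, rcond=None)
        residual = yc - Z[:, active] @ coef
    return np.array([j for j in active if j < p], dtype=int)


def fuse_experiments(X, y, K=20, T=1, n_dummies=None, vote=0.75, seed=0):
    """Fuse K experiments by voting (an assumed, uncalibrated fusion rule)."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    n_dummies = p if n_dummies is None else n_dummies
    counts = np.zeros(p)
    for _ in range(K):
        counts[early_terminated_experiment(X, y, n_dummies, T, rng)] += 1
    freq = counts / K  # relative selection frequency per original variable
    return np.flatnonzero(freq > vote), freq


# Toy data: 200 samples, 50 predictors, only the first three are active.
data_rng = np.random.default_rng(1)
X = data_rng.standard_normal((200, 50))
y = 3.0 * X[:, :3].sum(axis=1) + data_rng.standard_normal(200)

selected, freq = fuse_experiments(X, y)
print("selected variables:", selected)
```

In this sketch, stopping each run once `T` dummies have entered keeps every experiment short, which gives an intuition for why early termination reduces computation in sparse settings. The voting threshold only filters out variables that enter sporadically; choosing `T` and the threshold so that the FDR stays below a target level is the part the paper addresses and this sketch does not.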

