23 September 2021

High-dimensional regression with potential prior information on variable importance

Benjamin G. Stokell, Rajen D. Shah

From arXiv, 16 pages, 7 figures

In a variety of high-dimensional regression settings, vague prior information may be available on the importance of predictors. Examples include the ordering of the variables given by their empirical variances (which is typically discarded through standardisation), the lag of predictors when fitting autoregressive models in time series settings, or the level of missingness of the variables. Whilst such orderings may not match the true importance of variables, we argue that there is little to be lost, and potentially much to be gained, by using them. We propose a simple scheme involving fitting a sequence of models indicated by the ordering. We show that the computational cost of fitting all models when ridge regression is used is no more than that of a single ridge regression fit, and describe a strategy for Lasso regression that makes use of previous fits to greatly speed up fitting the entire sequence of models. We propose to select a final estimator by cross-validation and provide a general result on the quality of the best performing estimator on a test set selected from among a number $M$ of competing estimators in a high-dimensional linear regression setting. Our result requires no sparsity assumptions and shows that only a $\log M$ price is incurred compared to the unknown best estimator. We demonstrate the effectiveness of our approach when applied to missing or corrupted data, and time series settings. An R package is available on GitHub.
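The scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce their R package: it refits ridge regression from scratch for each nested model, so it has none of the computational savings the abstract describes, and it uses a single hold-out split instead of cross-validation. The function names `ridge_fit` and `select_nested_ridge`, the penalty `lam`, the candidate model sizes, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X^T X + lam * I) beta = X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def select_nested_ridge(X_train, y_train, X_val, y_val, order, sizes, lam=1.0):
    """Fit ridge on the first k variables of `order` for each k in `sizes`
    and return the k with the lowest validation mean squared error."""
    val_mse = {}
    for k in sizes:
        cols = order[:k]  # nested model: the k variables ranked highest
        beta = ridge_fit(X_train[:, cols], y_train, lam)
        resid = y_val - X_val[:, cols] @ beta
        val_mse[k] = float(np.mean(resid ** 2))
    best_k = min(val_mse, key=val_mse.get)
    return best_k, val_mse


# Simulated data (hypothetical): columns have decreasing scale, and only
# the five highest-scale columns carry signal.
rng = np.random.default_rng(0)
n, p = 200, 50
scales = np.linspace(3.0, 0.5, p)
X = rng.normal(size=(n, p)) * scales
beta_true = np.zeros(p)
beta_true[:5] = 1.0
y = X @ beta_true + rng.normal(size=n)

# Prior ordering: variables ranked by empirical variance, largest first.
# No standardisation is applied, so this information is not discarded.
order = np.argsort(-X.var(axis=0))

# Candidate model sizes; M = len(sizes) competing estimators.
sizes = [5, 10, 20, 50]
best_k, val_mse = select_nested_ridge(
    X[:100], y[:100], X[100:], y[100:], order, sizes
)
print("selected model size:", best_k)
print("validation MSE by size:", val_mse)
```

Because the full model (all `p` variables) is among the candidates, a poor ordering costs only the selection step, which matches the abstract's argument that there is little to lose from using the ordering.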
