Optimal Sparse Recovery with Decision Stumps - Zhuanzhi Papers


Decision stumps · Optimizers · Feature selection · Analysis · Sparsity

March 8, 2023

Optimal Sparse Recovery with Decision Stumps


Kiarash Banihashem, MohammadTaghi Hajiaghayi, Max Springer

From arXiv; accepted to AAAI 2023

Decision trees are widely used for their low computational cost, good predictive performance, and ability to assess the importance of features. Though often used in practice for feature selection, the theoretical guarantees of these methods are not well understood. Here we obtain a tight finite sample bound for the feature selection problem in linear regression using single-depth decision trees. We examine the statistical properties of these "decision stumps" for the recovery of the $s$ active features from $p$ total features, where $s \ll p$. Our analysis provides tight sample performance guarantees on high-dimensional sparse systems which align with the finite sample bound of $O(s \log p)$ as obtained by Lasso, improving upon previous bounds for both the median and optimal splitting criteria. Our results extend to the non-linear regime as well as arbitrary sub-Gaussian distributions, demonstrating that tree-based methods attain strong feature selection properties under a wide variety of settings and further shedding light on the success of these methods in practice. As a byproduct of our analysis, we show that we can provably guarantee recovery even when the number of active features $s$ is unknown. We further validate our theoretical results and proof methodology using computational experiments.
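To make the setting concrete, here is a minimal sketch of feature selection with decision stumps. It assumes a median split on each feature, scored by the reduction in the variance of the response, and it assumes $s$ is known. The function names `stump_scores` and `select_features`, the synthetic data, and all parameter values are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def stump_scores(X, y):
    """Variance reduction achieved by a median-split stump on each feature."""
    n, p = X.shape
    total = np.var(y)
    scores = np.empty(p)
    for j in range(p):
        # Depth-one tree: split the samples at the median of feature j.
        left = X[:, j] <= np.median(X[:, j])
        y_l, y_r = y[left], y[~left]
        # Weighted variance of the response inside the two children.
        within = (len(y_l) * np.var(y_l) + len(y_r) * np.var(y_r)) / n
        scores[j] = total - within
    return scores


def select_features(X, y, s):
    """Sorted indices of the s features with the largest stump scores."""
    return np.sort(np.argsort(stump_scores(X, y))[-s:])


# Synthetic sparse linear model (assumed for illustration): only the
# first s of the p features influence the response.
rng = np.random.default_rng(0)
n, p, s = 400, 200, 3
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = 1.0
y = X @ beta + 0.1 * rng.standard_normal(n)

recovered = select_features(X, y, s)
print(recovered)
```

Each feature is scored independently with a single split, so the cost is one pass over the $p$ columns; an active feature separates the response means of the two halves and therefore yields a much larger variance reduction than an inactive one.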

