A Computational Exploration of Emerging Methods of Variable Importance Estimation

Estimation/Estimator · Performer · Support Vector Machine · xgboost · Redundant Features

August 5, 2022

Louis Mozart Kamdem, Ernest Fokoue

Estimating the importance of variables is an essential task in modern machine learning, because it helps evaluate how useful a feature is in a given model. Several techniques for estimating variable importance have been developed during the last decade. In this paper, we propose a computational and theoretical exploration of emerging methods of variable importance estimation, namely the Least Absolute Shrinkage and Selection Operator (LASSO), Support Vector Machine (SVM), the Predictive Error Function (PERF), Random Forest (RF), and Extreme Gradient Boosting (XGBOOST), which were tested on different kinds of real-life and simulated data. All of these methods handle both regression and classification tasks seamlessly, but all fail on data containing missing values. The implementation shows that PERF performs best on highly correlated data, closely followed by RF. PERF and XGBOOST are "data-hungry" methods: they had the worst performance on small data sizes, but they are the fastest in execution time. SVM is the most appropriate when the dataset contains many redundant features. An added advantage of PERF is its natural cut-off at zero, which separates positive from negative scores: positive scores indicate essential and significant features, while negative scores indicate useless features. RF and LASSO are very versatile in that they can be used in almost all situations, although they do not give the best results.
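As a rough illustration of one of the methods named in the abstract, the sketch below uses LASSO coefficient magnitudes as importance scores on simulated data. It is not the paper's code or experimental setup: the solver (cyclic coordinate descent), the synthetic data, the penalty value `lam = 0.1`, and all function names are assumptions made for this example.

```python
import random


def soft_threshold(rho, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of rows; no intercept is fitted, so the columns and y are
    assumed to be (approximately) centred.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta, since beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Hypothetical simulated data: features 0 and 1 drive the response,
# features 2 and 3 are pure noise.
random.seed(0)
n_samples = 200
X = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(n_samples)]
y = [3.0 * row[0] - 2.0 * row[1] + random.gauss(0.0, 0.5) for row in X]

beta = lasso_coordinate_descent(X, y, lam=0.1)
importance = [abs(b) for b in beta]  # larger magnitude = more important

for j, score in enumerate(importance):
    print(f"feature {j}: importance {score:.3f}")
```

With standardized features, a larger absolute coefficient suggests a more influential variable, and the L1 penalty tends to push the coefficients of uninformative features towards zero. Scores from other methods (for example RF or XGBOOST) are on different scales and are not directly comparable to these.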
