大数据设置中通用线性模型典型强力子抽样示范方法 (A model robust sub-sampling approach for Generalised Linear Models in Big data settings) - 专知论文

会员服务 ·

0

稳健性 · MoDELS · INFORMS · 情景 · 线性模型 ·

2022 年 9 月 6 日

A model robust sub-sampling approach for Generalised Linear Models in Big data settings

翻译：大数据设置中通用线性模型典型强力子抽样示范方法

Amalan Mahendran,Helen Thompson,James M. McGree

from arxiv, Removed unnecessary space before algorithms

In today's modern era of Big data, computationally efficient and scalable methods are needed to support timely insights and informed decision making. One such method is sub-sampling, where a subset of the Big data is analysed and used as the basis for inference rather than considering the whole data set. A key question when applying sub-sampling approaches is how to select an informative subset based on the questions being asked of the data. A recent approach for this has been proposed based on determining sub-sampling probabilities for each data point, but a limitation of this approach is that appropriate sub-sampling probabilities rely on an assumed model for the Big data. In this article, to overcome this limitation, we propose a model robust approach where a set of models is considered, and the sub-sampling probabilities are evaluated based on the weighted average of probabilities that would be obtained if each model was considered singularly. Theoretical support for such an approach is provided. Our model robust sub-sampling approach is applied in a simulation study and in two real world applications where performance is compared to current sub-sampling practices. The results show that our model robust approach outperforms alternative approaches.

翻译：在当今的“大数据”时代,需要采用计算有效和可扩缩的方法来支持及时的洞察力和知情的决策。其中一种方法是子抽样,对大数据的子集进行分析,将其作为推理的基础,而不是整个数据集。在应用子抽样方法时的一个关键问题是,如何根据所询问的数据问题选择一个信息子集。为此提出了最近的办法,其依据是确定每个数据点的子抽样概率,但这一办法的一个局限性是,适当的子抽样概率依赖于大数据假设模型。为了克服这一限制,我们建议了一种强有力的模型方法,其中考虑到一组模型,根据每个模型被视作单一的概率加权平均值对子样本概率进行评估。为这种方法提供了理论支持。我们的模型强有力的子抽样方法在模拟研究中应用,在两个真实的世界应用中,将我们的业绩与目前的子模型模拟方法进行比较。结果显示,将我们的模型比照现有稳妥的方法。

0

相关内容

稳健性

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

中国图象图形学学会CSIG

0+阅读 · 2021年11月15日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

牛蒡子中Arctignan A，Lappaol C及其衍生物的合成和抗白血病活性研究

国家自然科学基金

0+阅读 · 2014年12月31日

日本囊对虾Leurelectin识别弧菌鞭毛蛋白的机制和功能研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于叠加训练（ST）信道估计的相干光正交频分复用系统研究

国家自然科学基金

0+阅读 · 2013年12月31日

柽柳Dof转录因子的耐盐调控机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

高效中红外激光晶体Cr,Er,Re:YSGG（Re＝Eu3+, Tb3+）的生长及性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于Linked Open Data的Web服务语义互操作关键技术

国家自然科学基金

0+阅读 · 2012年12月31日

鲍鱼脏器多糖AHP-2的一级结构及胆囊收缩素分泌刺激活性机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

新型中红外激光晶体Er3＋:CaReAlO4(Re=Y,Gd)的研究

国家自然科学基金

0+阅读 · 2009年12月31日

n-3多不饱和脂肪酸降低血浆同型半胱氨酸的机理研究

国家自然科学基金

0+阅读 · 2009年12月31日

超支化聚合物/硅的多维、多尺度自组装研究

国家自然科学基金

0+阅读 · 2008年12月31日

Surgical Fine-Tuning Improves Adaptation to Distribution Shifts

Arxiv

1+阅读 · 2022年10月20日

Surprises in adversarially-trained linear regression

Arxiv

0+阅读 · 2022年10月20日

Dynamic selection of p-norm in linear adaptive filtering via online kernel-based reinforcement learning

Dynamic selection of p-norm in linear adaptive filtering via online kernel-based reinforcement learning

Arxiv

0+阅读 · 2022年10月20日

Double precision is not necessary for LSQR for solving discrete linear ill-posed problems

Arxiv

0+阅读 · 2022年10月20日

Adaptive greedy forward variable selection for linear regression models with incomplete data using multiple imputation

Arxiv

0+阅读 · 2022年10月20日

On Representing Mixed-Integer Linear Programs by Graph Neural Networks

Arxiv

1+阅读 · 2022年10月19日

Robust Bayesian Nonparametric Variable Selection for Linear Regression

Arxiv

0+阅读 · 2022年10月19日

ROSE: Robust Selective Fine-tuning for Pre-trained Language Models

Arxiv

0+阅读 · 2022年10月18日

Small Area Estimation using EBLUPs under the Nested Error Regression Model

Arxiv

0+阅读 · 2022年10月18日

Active Learning for Domain Adaptation: An Energy-based Approach

Arxiv

13+阅读 · 2021年12月2日

VIP会员

文章信息

相关主题

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《人工智能绝不能完全自主》

《人工智能的法律与伦理：军事自主机器独特挑战的深度剖析》316页

从数据到主导：AI与兵棋推演构筑决策优势

《特洛伊木马货柜：武器化集装箱的战略威胁》最新报告

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

中国图象图形学学会CSIG

0+阅读 · 2021年11月15日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

Surgical Fine-Tuning Improves Adaptation to Distribution Shifts

Arxiv

1+阅读 · 2022年10月20日

Surprises in adversarially-trained linear regression

Arxiv

0+阅读 · 2022年10月20日

Dynamic selection of p-norm in linear adaptive filtering via online kernel-based reinforcement learning

Dynamic selection of p-norm in linear adaptive filtering via online kernel-based reinforcement learning

Arxiv

0+阅读 · 2022年10月20日

Double precision is not necessary for LSQR for solving discrete linear ill-posed problems

Arxiv

0+阅读 · 2022年10月20日

Adaptive greedy forward variable selection for linear regression models with incomplete data using multiple imputation

Arxiv

0+阅读 · 2022年10月20日

On Representing Mixed-Integer Linear Programs by Graph Neural Networks

Arxiv

1+阅读 · 2022年10月19日

Robust Bayesian Nonparametric Variable Selection for Linear Regression

Arxiv

0+阅读 · 2022年10月19日

ROSE: Robust Selective Fine-tuning for Pre-trained Language Models

Arxiv

0+阅读 · 2022年10月18日

Small Area Estimation using EBLUPs under the Nested Error Regression Model

Arxiv

0+阅读 · 2022年10月18日

Active Learning for Domain Adaptation: An Energy-based Approach

Arxiv

13+阅读 · 2021年12月2日

相关基金

牛蒡子中Arctignan A，Lappaol C及其衍生物的合成和抗白血病活性研究

国家自然科学基金

0+阅读 · 2014年12月31日

日本囊对虾Leurelectin识别弧菌鞭毛蛋白的机制和功能研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于叠加训练（ST）信道估计的相干光正交频分复用系统研究

国家自然科学基金

0+阅读 · 2013年12月31日

柽柳Dof转录因子的耐盐调控机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

高效中红外激光晶体Cr,Er,Re:YSGG（Re＝Eu3+, Tb3+）的生长及性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于Linked Open Data的Web服务语义互操作关键技术

国家自然科学基金

0+阅读 · 2012年12月31日

鲍鱼脏器多糖AHP-2的一级结构及胆囊收缩素分泌刺激活性机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

新型中红外激光晶体Er3＋:CaReAlO4(Re=Y,Gd)的研究

国家自然科学基金

0+阅读 · 2009年12月31日

n-3多不饱和脂肪酸降低血浆同型半胱氨酸的机理研究

国家自然科学基金

0+阅读 · 2009年12月31日

超支化聚合物/硅的多维、多尺度自组装研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员