线性关系关联性测试的预测数据校准 (Predictive Data Calibration for Linear Correlation Significance Testing) - 专知论文

会员服务 ·

0

相关系数 · 线性的 · 后验分布 · 线性相关 · 估计/估计量 ·

2022 年 8 月 15 日

Predictive Data Calibration for Linear Correlation Significance Testing

翻译：线性关系关联性测试的预测数据校准

Kaustubh R. Patil,Simon B. Eickhoff,Robert Langner

Inferring linear relationships lies at the heart of many empirical investigations. A measure of linear dependence should correctly evaluate the strength of the relationship as well as qualify whether it is meaningful for the population. Pearson's correlation coefficient (PCC), the \textit{de-facto} measure for bivariate relationships, is known to lack in both regards. The estimated strength $r$ maybe wrong due to limited sample size, and nonnormality of data. In the context of statistical significance testing, erroneous interpretation of a $p$-value as posterior probability leads to Type I errors -- a general issue with significance testing that extends to PCC. Such errors are exacerbated when testing multiple hypotheses simultaneously. To tackle these issues, we propose a machine-learning-based predictive data calibration method which essentially conditions the data samples on the expected linear relationship. Calculating PCC using calibrated data yields a calibrated $p$-value that can be interpreted as posterior probability together with a calibrated $r$ estimate, a desired outcome not provided by other methods. Furthermore, the ensuing independent interpretation of each test might eliminate the need for multiple testing correction. We provide empirical evidence favouring the proposed method using several simulations and application to real-world data.

翻译：推断线性关系是许多经验性调查的核心。线性依赖度的尺度应该正确评估这种关系的强度,并证明这种关系是否对人口有意义。 Pearson的双轨关系相关系数(PCC),即双轨关系中的基于机器学习的预测数据校准方法(PCC),在这两个方面已知都缺乏。估计的强度美元可能是错误的,因为抽样规模有限,数据不具有正常性。在统计意义测试方面,错误地解释美元价值作为后继概率导致I类错误 -- -- 这个问题具有重大意义的测试,延伸至PCC。在同时测试多个假设时,这种错误会加剧。为了解决这些问题,我们建议一种基于机器学习的预测数据校准方法,基本上为预期线性关系的数据样本提供条件。使用校准数据计算得出的校准美元价值,可以被解释为事后概率,加上校准的美元估计数,这是其他方法没有提供的预期结果。此外,随后对每次测试进行的独立解释可能会消除对多重测试的需要。我们提供实验性证据, 使用模拟方法来模拟模拟数据。

0

相关内容

相关系数

不可错过！700+ppt《因果推理》课程！杜克大学Fan Li教程

不可错过！700+ppt《因果推理》课程！杜克大学Fan Li教程

专知会员服务

72+阅读 · 2022年7月11日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

最浅显的奇异值分解(SVD)介绍，《Singular Value Decomposition as Simply as Possible》

最浅显的奇异值分解(SVD)介绍，《Singular Value Decomposition as Simply as Possible》

专知会员服务

12+阅读 · 2022年3月14日

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

104+阅读 · 2022年2月10日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

复杂数据下含指标项半参数模型结构的统计推断及应用

国家自然科学基金

0+阅读 · 2014年12月31日

NEDD9小分子抑制剂的筛选及机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

带跳扩散模型的非参数统计推断研究

国家自然科学基金

0+阅读 · 2013年12月31日

非线性算子零点的分裂算法：收敛性分析与应用

国家自然科学基金

0+阅读 · 2013年12月31日

新变指标Besov-Triebel-Lizorkin型函数空间及算子有界性

国家自然科学基金

0+阅读 · 2012年12月31日

函数域中的Vinogradov中值定理

国家自然科学基金

0+阅读 · 2012年12月31日

流体动力学领域中若干具有奇异性的数学模型

国家自然科学基金

0+阅读 · 2012年12月31日

胆固醇在食品加工中的氧化机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

MRI动态监测小肠缺血再灌注损伤肠上皮细胞内Ca2+变化的实验研究

国家自然科学基金

0+阅读 · 2010年12月31日

Adiponectin在肝脏缺血再灌注损伤中的抗肝细胞凋亡机制

国家自然科学基金

0+阅读 · 2009年12月31日

Few-Shot Calibration of Set Predictors via Meta-Learned Cross-Validation-Based Conformal Prediction

Few-Shot Calibration of Set Predictors via Meta-Learned Cross-Validation-Based Conformal Prediction

Arxiv

0+阅读 · 2022年10月6日

Maximum independent set (stable set) problem: A mathematical programming model with valid inequalities; Computational testing with binary search and alternate optimal basic solutions (extreme points)

Arxiv

0+阅读 · 2022年10月6日

Solar Flare Index Prediction Using SDO/HMI Vector Magnetic Data Products with Statistical and Machine Learning Methods

Arxiv

0+阅读 · 2022年10月6日

Beyond Impute-Then-Regress: Adapting Prediction to Missing Data

Arxiv

0+阅读 · 2022年10月5日

Prototypical Calibration for Few-shot Learning of Language Models

Arxiv

0+阅读 · 2022年10月5日

The Calibration Generalization Gap

Arxiv

0+阅读 · 2022年10月5日

Inference on High-dimensional Single-index Models with Streaming Data

Arxiv

0+阅读 · 2022年10月3日

Inferring Manifolds From Noisy Data Using Gaussian Processes

Arxiv

0+阅读 · 2022年10月2日

Learn then Test: Calibrating Predictive Algorithms to Achieve Risk Control

Arxiv

0+阅读 · 2022年9月29日

Learning an Efficient Terrain Representation for Haptic Localization of a Legged Robot

Arxiv

0+阅读 · 2022年9月29日

VIP会员

文章信息

相关主题

估计/估计量

相关VIP内容

不可错过！700+ppt《因果推理》课程！杜克大学Fan Li教程

不可错过！700+ppt《因果推理》课程！杜克大学Fan Li教程

专知会员服务

72+阅读 · 2022年7月11日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

最浅显的奇异值分解(SVD)介绍，《Singular Value Decomposition as Simply as Possible》

最浅显的奇异值分解(SVD)介绍，《Singular Value Decomposition as Simply as Possible》

专知会员服务

12+阅读 · 2022年3月14日

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

104+阅读 · 2022年2月10日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

操作系统智能体：基于多模态大模型（MLLM）的通用计算设备智能体综述

《美国太空军系统全生命周期建模、仿真与分析效能提升方案》最新84页报告

【博士论文】推进数据高效的深度学习：非参数 Transformer、主动测试与上下文学习

自主人工智能：未来战争是否将是自主化的？

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

Few-Shot Calibration of Set Predictors via Meta-Learned Cross-Validation-Based Conformal Prediction

Few-Shot Calibration of Set Predictors via Meta-Learned Cross-Validation-Based Conformal Prediction

Arxiv

0+阅读 · 2022年10月6日

Maximum independent set (stable set) problem: A mathematical programming model with valid inequalities; Computational testing with binary search and alternate optimal basic solutions (extreme points)

Arxiv

0+阅读 · 2022年10月6日

Solar Flare Index Prediction Using SDO/HMI Vector Magnetic Data Products with Statistical and Machine Learning Methods

Arxiv

0+阅读 · 2022年10月6日

Beyond Impute-Then-Regress: Adapting Prediction to Missing Data

Arxiv

0+阅读 · 2022年10月5日

Prototypical Calibration for Few-shot Learning of Language Models

Arxiv

0+阅读 · 2022年10月5日

The Calibration Generalization Gap

Arxiv

0+阅读 · 2022年10月5日

Inference on High-dimensional Single-index Models with Streaming Data

Arxiv

0+阅读 · 2022年10月3日

Inferring Manifolds From Noisy Data Using Gaussian Processes

Arxiv

0+阅读 · 2022年10月2日

Learn then Test: Calibrating Predictive Algorithms to Achieve Risk Control

Arxiv

0+阅读 · 2022年9月29日

Learning an Efficient Terrain Representation for Haptic Localization of a Legged Robot

Arxiv

0+阅读 · 2022年9月29日

相关基金

复杂数据下含指标项半参数模型结构的统计推断及应用

国家自然科学基金

0+阅读 · 2014年12月31日

NEDD9小分子抑制剂的筛选及机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

带跳扩散模型的非参数统计推断研究

国家自然科学基金

0+阅读 · 2013年12月31日

非线性算子零点的分裂算法：收敛性分析与应用

国家自然科学基金

0+阅读 · 2013年12月31日

新变指标Besov-Triebel-Lizorkin型函数空间及算子有界性

国家自然科学基金

0+阅读 · 2012年12月31日

函数域中的Vinogradov中值定理

国家自然科学基金

0+阅读 · 2012年12月31日

流体动力学领域中若干具有奇异性的数学模型

国家自然科学基金

0+阅读 · 2012年12月31日

胆固醇在食品加工中的氧化机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

MRI动态监测小肠缺血再灌注损伤肠上皮细胞内Ca2+变化的实验研究

国家自然科学基金

0+阅读 · 2010年12月31日

Adiponectin在肝脏缺血再灌注损伤中的抗肝细胞凋亡机制

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员