如何评估机器学习中的不确定性估计以利倒退? (How to Evaluate Uncertainty Estimates in Machine Learning for Regression?) - 专知论文

会员服务 ·

0

估计/估计量 · Machine Learning · 学成 · 置信度 · Neural Networks ·

2021 年 6 月 7 日

How to Evaluate Uncertainty Estimates in Machine Learning for Regression?

翻译：如何评估机器学习中的不确定性估计以利倒退?

Laurens Sluijterman,Eric Cator,Tom Heskes

from arxiv, 22 pages, 10 figures

As neural networks become more popular, the need for accompanying uncertainty estimates increases. The current testing methodology focusses on how good the predictive uncertainty estimates explain the differences between predictions and observations in a previously unseen test set. Intuitively this is a logical approach. The current setup of benchmark data sets also allows easy comparison between the different methods. We demonstrate, however, through both theoretical arguments and simulations that this way of evaluating the quality of uncertainty estimates has serious flaws. Firstly, it cannot disentangle the aleatoric from the epistemic uncertainty. Secondly, the current methodology considers the uncertainty averaged over all test samples, implicitly averaging out overconfident and underconfident predictions. When checking if the correct fraction of test points falls inside prediction intervals, a good score on average gives no guarantee that the intervals are sensible for individual points. We demonstrate through practical examples that these effects can result in favoring a method, based on the predictive uncertainty, that has undesirable behaviour of the confidence intervals. Finally, we propose a simulation-based testing approach that addresses these problems while still allowing easy comparison between different methods.

翻译：随着神经网络越来越普遍,随之而来的不确定性估计需要增加。当前的测试方法侧重于预测性不确定性估计如何很好地解释先前看不见的测试集中的预测和观察之间的差异。直观地说,这是一种逻辑的方法。目前的基准数据集的设置也容易地比较不同的方法。然而,我们通过理论论和模拟表明,这种评估不确定性估计质量的方法存在严重缺陷。首先,它不能将疏导的解析与共认性不确定性分解开来。第二,目前的方法考虑了所有测试样本的不确定性平均值,隐含地平均过自信和不自信的预测。在检查测试点的正确部分是否在预测间隔内时,平均的得分并不能保证每个点的间隔合理。我们通过实际例子表明,这些效应能够有利于一种基于预测性不确定性的方法,从而产生信任期的不良行为。最后,我们建议一种基于模拟的测试方法,既能解决这些问题,又能方便不同方法之间的比较。

0

相关内容

估计/估计量

估计/估计量

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

【新书】Python机器学习实战，545页pdf，Practical Machine Learning with Python

【新书】Python机器学习实战，545页pdf，Practical Machine Learning with Python

专知会员服务

310+阅读 · 2020年2月26日

【Python Tricks新书】The book: A Buffet of Awesome Python Features，299页pdf

【Python Tricks新书】The book: A Buffet of Awesome Python Features，299页pdf

专知会员服务

45+阅读 · 2020年1月1日

吴恩达新书《Machine Learning Yearning》完整中文版

吴恩达新书《Machine Learning Yearning》完整中文版

专知会员服务

147+阅读 · 2019年10月27日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

Successor representations 强化学习表示的生物学启发

Successor representations 强化学习表示的生物学启发

CreateAMind

6+阅读 · 2019年9月5日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

To Trust or Not to Trust a Regressor: Estimating and Explaining Trustworthiness of Regression Predictions

Arxiv

0+阅读 · 2021年7月28日

New Metrics to Evaluate the Performance and Fairness of Personalized Federated Learning

Arxiv

0+阅读 · 2021年7月28日

Statistically Meaningful Approximation: a Case Study on Approximating Turing Machines with Transformers

Arxiv

0+阅读 · 2021年7月28日

A Statistical Analysis of Summarization Evaluation Metrics using Resampling Methods

Arxiv

0+阅读 · 2021年7月26日

A Survey of Uncertainty in Deep Neural Networks

Arxiv

30+阅读 · 2021年7月7日

Policy Gradient Bayesian Robust Optimization for Imitation Learning

Arxiv

5+阅读 · 2021年6月11日

Aleatoric and Epistemic Uncertainty in Machine Learning: An Introduction to Concepts and Methods

Aleatoric and Epistemic Uncertainty in Machine Learning: An Introduction to Concepts and Methods

Arxiv

15+阅读 · 2020年4月3日

Deep High-Resolution Representation Learning for Human Pose Estimation

Arxiv

5+阅读 · 2019年2月25日

Test-time augmentation with uncertainty estimation for deep learning-based medical image segmentation

Test-time augmentation with uncertainty estimation for deep learning-based medical image segmentation

Arxiv

4+阅读 · 2018年7月19日

Analyzing Uncertainty in Neural Machine Translation

Arxiv

6+阅读 · 2018年2月28日

VIP会员

文章信息

相关主题

估计/估计量

Machine Learning

Neural Networks

相关VIP内容

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

【新书】Python机器学习实战，545页pdf，Practical Machine Learning with Python

【新书】Python机器学习实战，545页pdf，Practical Machine Learning with Python

专知会员服务

310+阅读 · 2020年2月26日

【Python Tricks新书】The book: A Buffet of Awesome Python Features，299页pdf

【Python Tricks新书】The book: A Buffet of Awesome Python Features，299页pdf

专知会员服务

45+阅读 · 2020年1月1日

吴恩达新书《Machine Learning Yearning》完整中文版

吴恩达新书《Machine Learning Yearning》完整中文版

专知会员服务

147+阅读 · 2019年10月27日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《复合人工智能决策优势：面向军事行动的人类数字孪生智能体编队与群体建模》最新文献

中文版《整合蓝绿作战域：北约空陆一体化向多域作战演进》2025最新资料

演进中的空中力量指挥控制体系

《在轨空间目标多智能体检测的制导、导航与控制》195页

相关资讯

Successor representations 强化学习表示的生物学启发

Successor representations 强化学习表示的生物学启发

CreateAMind

6+阅读 · 2019年9月5日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

To Trust or Not to Trust a Regressor: Estimating and Explaining Trustworthiness of Regression Predictions

Arxiv

0+阅读 · 2021年7月28日

New Metrics to Evaluate the Performance and Fairness of Personalized Federated Learning

Arxiv

0+阅读 · 2021年7月28日

Statistically Meaningful Approximation: a Case Study on Approximating Turing Machines with Transformers

Arxiv

0+阅读 · 2021年7月28日

A Statistical Analysis of Summarization Evaluation Metrics using Resampling Methods

Arxiv

0+阅读 · 2021年7月26日

A Survey of Uncertainty in Deep Neural Networks

Arxiv

30+阅读 · 2021年7月7日

Policy Gradient Bayesian Robust Optimization for Imitation Learning

Arxiv

5+阅读 · 2021年6月11日

Aleatoric and Epistemic Uncertainty in Machine Learning: An Introduction to Concepts and Methods

Aleatoric and Epistemic Uncertainty in Machine Learning: An Introduction to Concepts and Methods

Arxiv

15+阅读 · 2020年4月3日

Deep High-Resolution Representation Learning for Human Pose Estimation

Arxiv

5+阅读 · 2019年2月25日

Test-time augmentation with uncertainty estimation for deep learning-based medical image segmentation

Test-time augmentation with uncertainty estimation for deep learning-based medical image segmentation

Arxiv

4+阅读 · 2018年7月19日

Analyzing Uncertainty in Neural Machine Translation

Arxiv

6+阅读 · 2018年2月28日

微信扫码咨询专知VIP会员