Variance in Neural Network Training: Fluctuations Are Harmless and Inevitable (Calibrated Chaos: Variance Between Runs of Neural Network Training is Harmless and Inevitable) - 专知 Papers

Variance · Test set · Neural network training · Fluctuation · Neural network

April 4, 2023

Calibrated Chaos: Variance Between Runs of Neural Network Training is Harmless and Inevitable

Translated title: Variance in Neural Network Training: Fluctuations Are Harmless and Inevitable

Typical neural network trainings have substantial variance in test-set performance between repeated runs, impeding hyperparameter comparison and training reproducibility. We present the following results towards understanding this variation. (1) Despite having significant variance on their test-sets, we demonstrate that standard CIFAR-10 and ImageNet trainings have very little variance in their performance on the test-distributions from which those test-sets are sampled, suggesting that variance is less of a practical issue than previously thought. (2) We present a simplifying statistical assumption which closely approximates the structure of the test-set accuracy distribution. (3) We argue that test-set variance is inevitable in the following two senses. First, we show that variance is largely caused by high sensitivity of the training process to initial conditions, rather than by specific sources of randomness like the data order and augmentations. Second, we prove that variance is unavoidable given the observation that ensembles of trained networks are well-calibrated. (4) We conduct preliminary studies of distribution-shift, fine-tuning, data augmentation and learning rate through the lens of variance between runs.

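Translated summary: Typical neural network training shows substantial variance in test-set performance between repeated runs, which hampers hyperparameter comparison and training reproducibility. The paper presents the following results toward understanding this variation. First, although the variance on the test sets is significant, standard CIFAR-10 and ImageNet trainings show almost no variance in performance on the test distributions from which those test sets are sampled, which suggests that variance is less of a practical problem than previously thought. Second, the authors propose a simplifying statistical assumption that closely approximates the structure of the test-set accuracy distribution. Third, they argue that test-set variance is inevitable in two senses: it is caused mainly by the training process being highly sensitive to initial conditions, rather than by specific sources of randomness such as data order and augmentation; and it is provably unavoidable given the observation that ensembles of trained networks are well-calibrated. Finally, they carry out preliminary studies of distribution shift, fine-tuning, data augmentation, and learning rate through the lens of variance between runs.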
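Result (1) can be made concrete with a toy simulation. The sketch below is not the authors' method, and the abstract does not say what the simplifying assumption in result (2) is. Purely for illustration, it assumes that each run classifies each test example correctly with an independent, fixed probability. The array `p_correct`, the helper names, and the chosen probabilities are all hypothetical. Under that assumption every run has the same expected accuracy on the underlying distribution, yet accuracy on a finite test set still fluctuates from run to run.

```python
import numpy as np


def simulate_test_accuracy(p_correct, n_runs, rng):
    """Simulate test-set accuracy for n_runs independently seeded trainings.

    p_correct[i] is a hypothetical probability that a random run gets test
    example i right. Outcomes are drawn independently per run and per
    example, which is an illustrative assumption, not a claim from the paper.
    """
    # correct[r, i] is True when simulated run r classifies example i correctly.
    correct = rng.random((n_runs, p_correct.size)) < p_correct
    return correct.mean(axis=1)


def predicted_std(p_correct):
    """Std of test-set accuracy implied by independent Bernoulli outcomes."""
    n = p_correct.size
    return np.sqrt(np.sum(p_correct * (1.0 - p_correct))) / n


rng = np.random.default_rng(0)

# Hypothetical test set of 10,000 examples: most are easy, some are coin flips.
p_correct = np.concatenate([np.full(9000, 0.99), np.full(1000, 0.5)])

accs = simulate_test_accuracy(p_correct, n_runs=500, rng=rng)
empirical_std = accs.std(ddof=1)
theory_std = predicted_std(p_correct)

print(f"expected accuracy of every run: {p_correct.mean():.4f}")
print(f"std of test-set accuracy (simulated): {empirical_std:.5f}")
print(f"std of test-set accuracy (formula):   {theory_std:.5f}")
```

With these made-up probabilities the formula gives a standard deviation of about 0.18 percentage points, even though no simulated run is better than any other in expectation. This is the kind of gap between test-set variance and test-distribution variance that the abstract describes.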
