Dropout Reduces Underfitting - 专知论文

会员服务 ·

0

暂退法 · 欠拟合 · 可约的 · 正则化项 · MoDELS ·

2023 年 5 月 31 日

Dropout Reduces Underfitting

翻译：暂无翻译

Zhuang Liu,Zhiqiu Xu,Joseph Jin,Zhiqiang Shen,Trevor Darrell

from arxiv, ICML 2023

Introduced by Hinton et al. in 2012, dropout has stood the test of time as a regularizer for preventing overfitting in neural networks. In this study, we demonstrate that dropout can also mitigate underfitting when used at the start of training. During the early phase, we find dropout reduces the directional variance of gradients across mini-batches and helps align the mini-batch gradients with the entire dataset's gradient. This helps counteract the stochasticity of SGD and limit the influence of individual batches on model training. Our findings lead us to a solution for improving performance in underfitting models - early dropout: dropout is applied only during the initial phases of training, and turned off afterwards. Models equipped with early dropout achieve lower final training loss compared to their counterparts without dropout. Additionally, we explore a symmetric technique for regularizing overfitting models - late dropout, where dropout is not used in the early iterations and is only activated later in training. Experiments on ImageNet and various vision tasks demonstrate that our methods consistently improve generalization accuracy. Our results encourage more research on understanding regularization in deep learning and our methods can be useful tools for future neural network training, especially in the era of large data. Code is available at https://github.com/facebookresearch/dropout.

翻译：暂无翻译

0

相关内容

暂退法

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

71+阅读 · 2022年6月28日

剑桥大学《数据科学: 原理与实践》课程，附PPT下载

剑桥大学《数据科学: 原理与实践》课程，附PPT下载

专知会员服务

47+阅读 · 2021年1月20日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

44+阅读 · 2020年10月31日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

92+阅读 · 2020年3月12日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

12+阅读 · 2019年10月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

45+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

53+阅读 · 2019年10月17日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

77+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

64+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

99+阅读 · 2019年10月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

26+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

14+阅读 · 2019年4月13日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

15+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

17+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

26+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

41+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

16+阅读 · 2018年12月24日

【论文推荐】最新十篇度量学习相关论文—可量化表示、非线性度量学习、在线深度量学习、大间隔最近邻、判别深度度量、域自适应

【论文推荐】最新十篇度量学习相关论文—可量化表示、非线性度量学习、在线深度量学习、大间隔最近邻、判别深度度量、域自适应

专知

12+阅读 · 2018年5月18日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

19+阅读 · 2017年12月17日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

10+阅读 · 2017年11月12日

Resveratrol联合MSCs移植对阿尔茨海默鼠的干预效果及Sirt1分子信号的介导作用

国家自然科学基金

0+阅读 · 2014年12月31日

壳层隔绝纳米粒子增强拉曼光谱在近海海域污染物快速检测中应用的基础研究

国家自然科学基金

0+阅读 · 2013年12月31日

祁连山冻土区天然气水合物微观结构特征及其动态聚散规律研究

国家自然科学基金

0+阅读 · 2013年12月31日

糖尿病脑病中核酸交换因子Sil1参与内质网应激诱导神经元损伤机制的研究

国家自然科学基金

0+阅读 · 2013年12月31日

TrkB激动剂7,8-dihydroxyflavone对脆性X综合征突触可塑和学习记忆的影响及机制

国家自然科学基金

0+阅读 · 2012年12月31日

深水柔性立管涡激振动与上部平台运动的动力耦合研究

国家自然科学基金

0+阅读 · 2012年12月31日

斑马鱼发育过程中SINEs功能的研究

国家自然科学基金

0+阅读 · 2011年12月31日

覆盖曲面理论、随机级数及算子不等式的若干研究

国家自然科学基金

0+阅读 · 2011年12月31日

Tribble3基因调控MAPK信号通路在表皮增殖及银屑病皮损形成中的作用

国家自然科学基金

0+阅读 · 2009年12月31日

CO2促使离子液体与极性物质相分离的机理研究

国家自然科学基金

0+阅读 · 2008年12月31日

Learned Thresholds Token Merging and Pruning for Vision Transformers

Arxiv

0+阅读 · 2023年7月20日

Class Attention to Regions of Lesion for Imbalanced Medical Image Recognition

Arxiv

0+阅读 · 2023年7月19日

YOLIC: An Efficient Method for Object Localization and Classification on Edge Devices

Arxiv

0+阅读 · 2023年7月19日

Towards Sustainable Deep Learning for Multi-Label Classification on NILM

Arxiv

0+阅读 · 2023年7月18日

Quantum Network Discrimination

Arxiv

0+阅读 · 2023年7月14日

Enabling Deep Learning on Edge Devices

Arxiv

19+阅读 · 2022年10月6日

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

Arxiv

32+阅读 · 2022年4月25日

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Arxiv

14+阅读 · 2021年1月31日

Go Wide, Then Narrow: Efficient Training of Deep Thin Networks

Arxiv

15+阅读 · 2020年7月1日

Distributed Machine Learning on Mobile Devices: A Survey

Distributed Machine Learning on Mobile Devices: A Survey

Arxiv

35+阅读 · 2019年9月18日

VIP会员

文章信息

相关主题

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

71+阅读 · 2022年6月28日

剑桥大学《数据科学: 原理与实践》课程，附PPT下载

剑桥大学《数据科学: 原理与实践》课程，附PPT下载

专知会员服务

47+阅读 · 2021年1月20日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

44+阅读 · 2020年10月31日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

92+阅读 · 2020年3月12日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

12+阅读 · 2019年10月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

45+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

53+阅读 · 2019年10月17日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

77+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

64+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

99+阅读 · 2019年10月9日

热门VIP内容

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

26+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

14+阅读 · 2019年4月13日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

15+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

17+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

26+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

41+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

16+阅读 · 2018年12月24日

【论文推荐】最新十篇度量学习相关论文—可量化表示、非线性度量学习、在线深度量学习、大间隔最近邻、判别深度度量、域自适应

【论文推荐】最新十篇度量学习相关论文—可量化表示、非线性度量学习、在线深度量学习、大间隔最近邻、判别深度度量、域自适应

专知

12+阅读 · 2018年5月18日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

19+阅读 · 2017年12月17日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

10+阅读 · 2017年11月12日

相关论文

Learned Thresholds Token Merging and Pruning for Vision Transformers

Arxiv

0+阅读 · 2023年7月20日

Class Attention to Regions of Lesion for Imbalanced Medical Image Recognition

Arxiv

0+阅读 · 2023年7月19日

YOLIC: An Efficient Method for Object Localization and Classification on Edge Devices

Arxiv

0+阅读 · 2023年7月19日

Towards Sustainable Deep Learning for Multi-Label Classification on NILM

Arxiv

0+阅读 · 2023年7月18日

Quantum Network Discrimination

Arxiv

0+阅读 · 2023年7月14日

Enabling Deep Learning on Edge Devices

Arxiv

19+阅读 · 2022年10月6日

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

Arxiv

32+阅读 · 2022年4月25日

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Arxiv

14+阅读 · 2021年1月31日

Go Wide, Then Narrow: Efficient Training of Deep Thin Networks

Arxiv

15+阅读 · 2020年7月1日

Distributed Machine Learning on Mobile Devices: A Survey

Distributed Machine Learning on Mobile Devices: A Survey

Arxiv

35+阅读 · 2019年9月18日

相关基金

Resveratrol联合MSCs移植对阿尔茨海默鼠的干预效果及Sirt1分子信号的介导作用

国家自然科学基金

0+阅读 · 2014年12月31日

壳层隔绝纳米粒子增强拉曼光谱在近海海域污染物快速检测中应用的基础研究

国家自然科学基金

0+阅读 · 2013年12月31日

祁连山冻土区天然气水合物微观结构特征及其动态聚散规律研究

国家自然科学基金

0+阅读 · 2013年12月31日

糖尿病脑病中核酸交换因子Sil1参与内质网应激诱导神经元损伤机制的研究

国家自然科学基金

0+阅读 · 2013年12月31日

TrkB激动剂7,8-dihydroxyflavone对脆性X综合征突触可塑和学习记忆的影响及机制

国家自然科学基金

0+阅读 · 2012年12月31日

深水柔性立管涡激振动与上部平台运动的动力耦合研究

国家自然科学基金

0+阅读 · 2012年12月31日

斑马鱼发育过程中SINEs功能的研究

国家自然科学基金

0+阅读 · 2011年12月31日

覆盖曲面理论、随机级数及算子不等式的若干研究

国家自然科学基金

0+阅读 · 2011年12月31日

Tribble3基因调控MAPK信号通路在表皮增殖及银屑病皮损形成中的作用

国家自然科学基金

0+阅读 · 2009年12月31日

CO2促使离子液体与极性物质相分离的机理研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员