The Implicit Bias of Gradient Descent on Separable Data - 专知论文 (Zhuanzhi Papers)

Predictor/decision function · Separable · Loss · Hard margin · Optimizer

July 19, 2022

The Implicit Bias of Gradient Descent on Separable Data

Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya Gunasekar, Nathan Srebro

From arXiv. Fixed a few minor issues in v4: a typo in Assumption 2, a LaTeX issue in Lemma 1, and a few words added to the proof sketch.

We examine gradient descent on unregularized logistic regression problems, with homogeneous linear predictors on linearly separable datasets. We show the predictor converges to the direction of the max-margin (hard margin SVM) solution. The result also generalizes to other monotone decreasing loss functions with an infimum at infinity, to multi-class problems, and to training a weight layer in a deep network in a certain restricted setting. Furthermore, we show this convergence is very slow, and only logarithmic in the convergence of the loss itself. This can help explain the benefit of continuing to optimize the logistic or cross-entropy loss even after the training error is zero and the training loss is extremely small, and, as we show, even if the validation loss increases. Our methodology can also aid in understanding implicit regularization in more complex models and with other optimization methods.
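The behavior described in the abstract can be observed on a tiny example. The following is a minimal sketch, not the authors' code: the three-point dataset `Z`, the step size `lr`, the checkpoint schedule, and the helper names (`loss_and_grad`, `angle_to_w_hat`, `run_gd`) are all assumptions chosen for illustration. The hard-margin direction `W_HAT` is worked out by hand for this particular dataset, so the script only needs the standard library. It runs plain gradient descent on the unregularized logistic loss with a homogeneous linear predictor and records, at a few checkpoints, the training loss, the norm of the weights, and the angle between the weights and the max-margin direction.

```python
import math

# Toy separable dataset (an assumption for illustration, not from the paper).
# Each row is z_n = y_n * x_n, i.e. the label is folded into the point, so
# "correctly classified" means w . z_n > 0. There is no bias term
# (homogeneous linear predictor).
Z = [(1.0, 2.0), (1.0, -1.0), (3.0, 1.0)]

# Hard-margin SVM solution for this Z, derived by hand:
# minimize ||w||^2 subject to w . z_n >= 1 for all n.
# At w = (1, 0) the first two constraints are tight with multipliers
# 1/3 and 2/3 (w = 1/3 * z_1 + 2/3 * z_2), and the third is slack.
W_HAT = (1.0, 0.0)


def loss_and_grad(w):
    """Return the summed logistic loss log(1 + exp(-w . z_n)) and its gradient."""
    loss, g0, g1 = 0.0, 0.0, 0.0
    for z0, z1 in Z:
        m = w[0] * z0 + w[1] * z1  # margin of this point
        # Numerically stable evaluation of log(1 + exp(-m)).
        loss += max(-m, 0.0) + math.log1p(math.exp(-abs(m)))
        # d/dm log(1 + exp(-m)) = -1 / (1 + exp(m)).
        s = 1.0 / (1.0 + math.exp(m))
        g0 -= s * z0
        g1 -= s * z1
    return loss, (g0, g1)


def angle_to_w_hat(w):
    """Angle in radians between w and the max-margin direction W_HAT."""
    dot = w[0] * W_HAT[0] + w[1] * W_HAT[1]
    cross = w[0] * W_HAT[1] - w[1] * W_HAT[0]
    return math.atan2(abs(cross), dot)


def run_gd(steps=100_000, lr=0.1,
           checkpoints=(10, 100, 1_000, 10_000, 100_000)):
    """Plain gradient descent from w = 0; returns (t, loss, ||w||, angle) rows."""
    w = (0.0, 0.0)
    history = []
    for t in range(1, steps + 1):
        _, (g0, g1) = loss_and_grad(w)
        w = (w[0] - lr * g0, w[1] - lr * g1)
        if t in checkpoints:
            loss, _ = loss_and_grad(w)
            history.append((t, loss, math.hypot(*w), angle_to_w_hat(w)))
    return history


history = run_gd()
for t, loss, norm, angle in history:
    print(f"t={t:>6}  loss={loss:.3e}  ||w||={norm:.3f}  angle={angle:.4f} rad")
```

On a run of this sketch, the loss column should fall by roughly an order of magnitude per checkpoint while the angle column shrinks far more slowly, which is the gap between the two convergence rates that the abstract points to. The weight norm keeps growing throughout, since the loss has no finite minimizer on separable data.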
