A Stochastic Bundle Method for Interpolating Networks - Zhuanzhi Papers


Empirical loss · Approximation · Robustness · Linear · Networking

29 January 2022

A Stochastic Bundle Method for Interpolating Networks


Alasdair Paren, Leonard Berrada, Rudra P. K. Poudel, M. Pawan Kumar

We propose a novel method for training deep neural networks that are capable of interpolation, that is, driving the empirical loss to zero. At each iteration, our method constructs a stochastic approximation of the learning objective. The approximation, known as a bundle, is a pointwise maximum of linear functions. Our bundle contains a constant function that lower bounds the empirical loss. This enables us to compute an automatic adaptive learning rate, thereby providing an accurate solution. In addition, our bundle includes linear approximations computed at the current iterate and other linear estimates of the DNN parameters. The use of these additional approximations makes our method significantly more robust to its hyperparameters. Based on its desirable empirical properties, we term our method Bundle Optimisation for Robust and Accurate Training (BORAT). In order to operationalise BORAT, we design a novel algorithm for optimising the bundle approximation efficiently at each iteration. We establish the theoretical convergence of BORAT in both convex and non-convex settings. Using standard publicly available data sets, we provide a thorough comparison of BORAT to other single hyperparameter optimisation algorithms. Our experiments demonstrate BORAT matches the state-of-the-art generalisation performance for these methods and is the most robust.
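To make the role of the constant lower bound concrete, here is a minimal sketch of the simplest possible bundle: one linear approximation at the current iterate plus the constant function 0. This is not the BORAT algorithm itself, which uses additional linear pieces and its own solver. Under the assumption that the loss is non-negative, the proximal step over this two-piece bundle has a closed form in which the step size is the smaller of a maximal learning rate and loss / ||gradient||^2. All names below (`two_piece_bundle_step`, `sample_loss_and_grad`, `train`, `DATA`, `max_lr`) and the toy least-squares problem are hypothetical and chosen only for illustration.

```python
def two_piece_bundle_step(w, loss, grad, max_lr):
    """Minimise ||v - w||^2 / (2 * max_lr) + max(loss + grad.(v - w), 0) over v.

    Assumes loss >= 0, so the constant 0 is a valid lower bound.
    Returns the new parameters and the step size actually used.
    """
    sq_norm = sum(g * g for g in grad)
    if sq_norm == 0.0:
        return list(w), 0.0  # zero gradient: there is no direction to move in
    # Either the linear piece is active (full step max_lr) or the minimiser
    # sits where the linear piece meets the constant 0 (step loss / ||grad||^2).
    step = min(max_lr, loss / sq_norm)
    return [wi - step * gi for wi, gi in zip(w, grad)], step


def sample_loss_and_grad(w, x, y):
    """Squared error on one sample, 0.5 * (w.x - y)^2, and its gradient."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) - y
    return 0.5 * residual ** 2, [residual * xi for xi in x]


# Hypothetical toy data that a linear model can fit exactly (w = [2, -1]),
# so the empirical loss can be driven to zero (interpolation).
DATA = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]


def train(max_lr=10.0, epochs=50):
    """Cycle through the samples, taking one two-piece bundle step per sample."""
    w = [0.0, 0.0]
    steps = []
    for _ in range(epochs):
        for x, y in DATA:
            loss, grad = sample_loss_and_grad(w, x, y)
            w, step = two_piece_bundle_step(w, loss, grad, max_lr)
            steps.append(step)
    total_loss = sum(sample_loss_and_grad(w, x, y)[0] for x, y in DATA)
    return w, total_loss, steps


if __name__ == "__main__":
    w, total_loss, steps = train()
    print("parameters:", w)
    print("total loss:", total_loss)
    print("largest step used:", max(steps))
```

In this sketch `max_lr` is set deliberately large: the lower bound caps each step automatically, which is the mechanism the abstract credits for the adaptive learning rate. The extra linear pieces that the abstract credits for robustness to hyperparameters are not modelled here.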


