Variance Reduction on General Adaptive Stochastic Mirror Descent - 专知 Papers

Variance reduction · Reducible · Variance · Nonconvex · Extensibility

August 29, 2021

Variance Reduction on General Adaptive Stochastic Mirror Descent

Wenjie Li, Zhanyu Wang, Yichen Zhang, Guang Cheng

From arXiv; NeurIPS 2020 OPT workshop

In this work, we investigate the idea of variance reduction by studying its properties with general adaptive mirror descent algorithms in nonsmooth nonconvex finite-sum optimization problems. We propose a simple yet generalized framework for variance reduced adaptive mirror descent algorithms named SVRAMD and provide its convergence analysis in both the nonsmooth nonconvex problem and the P-L conditioned problem. We prove that variance reduction reduces the SFO complexity of adaptive mirror descent algorithms and thus accelerates their convergence. In particular, our general theory implies that variance reduction can be applied to algorithms using time-varying step sizes and self-adaptive algorithms such as AdaGrad and RMSProp. Moreover, the convergence rates of SVRAMD recover the best existing rates of non-adaptive variance reduced mirror descent algorithms without complicated algorithmic components. Extensive experiments in deep learning validate our theoretical findings.
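The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' SVRAMD implementation: it assumes an SVRG-style gradient estimator, an AdaGrad-style diagonal proximal function psi_t(x) = 0.5 * x^T H_t x, and a toy least-squares finite sum with an L1 term standing in for the nonsmooth part. All names (`vr_adaptive_md`, `make_problem`, `lam`, `lr`, `inner`) and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def grad_i(x, a, b):
    """Gradient of the single-sample loss 0.5 * (a.x - b)^2."""
    r = sum(aj * xj for aj, xj in zip(a, x)) - b
    return [r * aj for aj in a]


def full_grad(x, A, B):
    """Average gradient over all n samples (costs n gradient-oracle calls)."""
    n, d = len(A), len(x)
    g = [0.0] * d
    for a, b in zip(A, B):
        gi = grad_i(x, a, b)
        for j in range(d):
            g[j] += gi[j] / n
    return g


def objective(x, A, B, lam):
    """Finite-sum least-squares loss plus a nonsmooth L1 term."""
    loss = sum(0.5 * (sum(aj * xj for aj, xj in zip(a, x)) - b) ** 2
               for a, b in zip(A, B)) / len(A)
    return loss + lam * sum(abs(xj) for xj in x)


def vr_adaptive_md(A, B, lam=0.01, lr=0.3, epochs=20, inner=50, eps=1e-8, seed=0):
    """Assumed sketch: SVRG-style estimator with an AdaGrad-style mirror step."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x = [0.0] * d
    acc = [0.0] * d  # running sum of squared gradient estimates (AdaGrad-style)
    for _ in range(epochs):
        snap = list(x)              # snapshot point for this epoch
        mu = full_grad(snap, A, B)  # full gradient at the snapshot
        for _ in range(inner):
            i = rng.randrange(n)
            gx = grad_i(x, A[i], B[i])
            gs = grad_i(snap, A[i], B[i])
            # Variance-reduced estimate: unbiased for the full gradient at x.
            v = [gx[j] - gs[j] + mu[j] for j in range(d)]
            for j in range(d):
                acc[j] += v[j] ** 2
                h = math.sqrt(acc[j]) + eps  # j-th diagonal entry of H_t
                z = x[j] - lr * v[j] / h     # preconditioned gradient step
                t = lr * lam / h             # soft-threshold level in the H_t metric
                x[j] = math.copysign(max(abs(z) - t, 0.0), z)
    return x


def make_problem(n=50, seed=1):
    """Hypothetical toy data: noiseless linear measurements of x_true."""
    rng = random.Random(seed)
    x_true = [1.0, -2.0, 0.5]
    A = [[rng.gauss(0.0, 1.0) for _ in x_true] for _ in range(n)]
    B = [sum(aj * xj for aj, xj in zip(a, x_true)) for a in A]
    return A, B


A, B = make_problem()
x_hat = vr_adaptive_md(A, B)
start = objective([0.0, 0.0, 0.0], A, B, 0.01)
final = objective(x_hat, A, B, 0.01)
```

With a diagonal quadratic proximal function, the mirror descent update has a closed form per coordinate (a scaled soft-threshold), which is why the inner loop needs no subproblem solver. Swapping the `acc` update for an exponential moving average would give an RMSProp-style variant under the same assumptions.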

Related content

Variance reduction

Optimization Algorithms on Deep Learning (73-page slide deck)

专知 Member Service

135+ reads · June 16, 2021

New INRIA book "Machine Learning Theory": a 229-page PDF giving a principled exposition of machine learning

专知 Member Service

69+ reads · March 27, 2021

Latest INRIA "Machine Learning Theory" course notes (176-page PDF)

专知 Member Service

51+ reads · December 14, 2020

[ICML 2020] On the Generalization Benefit of Noise in Stochastic Gradient Descent

专知 Member Service

19+ reads · June 29, 2020

Deep reinforcement learning policy gradient tutorial (53-page slide deck)

专知 Member Service

184+ reads · February 1, 2020

[IPAM] High-dimensional cost landscape and gradient descent in Tensor PCA and its generalisations (with 41-page PDF)

专知 Member Service

14+ reads · November 22, 2019

[Object detection | 2019 survey] A Survey of Deep Learning-based Object Detection (From Fast R-CNN to NAS-FPN) (with 30-page PDF)

专知 Member Service

56+ reads · November 15, 2019

[Course] Princeton University Spring 2019 "Machine Learning Optimization" lecture notes

专知 Member Service

85+ reads · October 29, 2019

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知 Member Service

49+ reads · October 17, 2019

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知 Member Service

36+ reads · October 17, 2019

(Deleted)

将门创投

4+ reads · October 11, 2019

Towards Noise-adaptive, Problem-adaptive Stochastic Gradient Descent

arXiv

0+ reads · October 21, 2021

Adaptive Gradient Descent for Optimal Control of Parabolic Equations with Random Parameters

arXiv

0+ reads · October 20, 2021

On the Global Convergence of Momentum-based Policy Gradient

arXiv

0+ reads · October 19, 2021

A Global Stochastic Optimization Particle Filter Algorithm

arXiv

0+ reads · October 18, 2021

Adaptive Tikhonov strategies for stochastic ensemble Kalman inversion

arXiv

0+ reads · October 18, 2021

Variance-Reduced Splitting Schemes for Monotone Stochastic Generalized Equations

arXiv

0+ reads · October 18, 2021

Structured second-order methods via natural gradient descent

arXiv

0+ reads · October 15, 2021

Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks

arXiv

8+ reads · November 21, 2018

Towards Understanding Acceleration Tradeoff between Momentum and Asynchrony in Nonconvex Stochastic Optimization

arXiv

3+ reads · October 1, 2018

Accelerated Randomized Coordinate Descent Algorithms for Stochastic Optimization and Online Learning

arXiv

9+ reads · July 16, 2018
