Nonconvex · Cost function · Momentum · Almost sure convergence · Functional

September 30, 2021

Convergence of Gradient Algorithms for Nonconvex C^{1+alpha} Cost Functions


Zixuan Wang, Shanjian Tang

From arXiv, 16 pages

This paper is concerned with the convergence of stochastic gradient algorithms with momentum terms in the nonconvex setting. A class of stochastic momentum methods, including stochastic gradient descent, heavy ball, and Nesterov's accelerated gradient, is analyzed in a general framework under mild assumptions. Based on the convergence result for expected gradients, we prove almost sure convergence through a detailed discussion of the effects of momentum and the number of upcrossings. Notably, no additional restrictions are imposed on the objective function or the stepsize. Another improvement over previous results is that the usual Lipschitz condition on the gradient is relaxed to Hölder continuity. As a byproduct, we apply a localization procedure to extend our results to stochastic stepsizes.
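To make the class of methods in the abstract concrete, here is a minimal Python sketch of one common way to write stochastic gradient descent, heavy ball, and Nesterov's accelerated gradient as a single update. The abstract does not spell out the paper's own framework, so everything here is an assumption chosen for illustration: the parameterization (`beta`, `gamma`), the stepsize schedule `0.5 / (k + 1) ** 0.75`, the Gaussian noise model, and the test function f(x) = sum(|x_i|^1.5 + cos(x_i)). That function is nonconvex, and its gradient is Hölder continuous with exponent 1/2 but not Lipschitz at 0.

```python
import numpy as np


def grad_f(x):
    """Gradient of the illustrative objective f(x) = sum(|x_i|**1.5 + cos(x_i)).

    The sqrt term makes the gradient 1/2-Hölder continuous but not
    Lipschitz near 0; the cosine term makes f nonconvex.
    """
    return 1.5 * np.sign(x) * np.sqrt(np.abs(x)) - np.sin(x)


def stochastic_momentum(x0, beta, gamma, n_steps=5000, noise_std=0.1, seed=0):
    """Run a unified stochastic momentum iteration (hypothetical helper).

    beta = 0, gamma = 0     -> stochastic gradient descent
    beta > 0, gamma = 0     -> heavy ball
    beta > 0, gamma = beta  -> Nesterov's accelerated gradient
    """
    rng = np.random.default_rng(seed)
    x_prev = np.array(x0, dtype=float)
    x = x_prev.copy()
    for k in range(n_steps):
        # Assumed decreasing stepsize; not taken from the paper.
        lr = 0.5 / (k + 1) ** 0.75
        # Point at which the gradient is queried (look-ahead if gamma > 0).
        y = x + gamma * (x - x_prev)
        # Noisy gradient: true gradient plus zero-mean Gaussian noise.
        g = grad_f(y) + noise_std * rng.standard_normal(x.shape)
        # Gradient step plus momentum term.
        x_next = x - lr * g + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x


if __name__ == "__main__":
    x0 = [3.0, -2.0]
    variants = [("SGD", 0.0, 0.0), ("heavy ball", 0.5, 0.0), ("Nesterov", 0.5, 0.5)]
    for name, beta, gamma in variants:
        x = stochastic_momentum(x0, beta, gamma)
        print(name, "final gradient norm:", np.linalg.norm(grad_f(x)))
```

Running the script prints the gradient norm at the last iterate for each variant. A single run like this only illustrates the setting; it says nothing about the almost sure convergence that the paper proves.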

