Escape saddle points by a simple gradient-descent based algorithm - 专知 Papers


November 28, 2021

Escape saddle points by a simple gradient-descent based algorithm


Chenyi Zhang, Tongyang Li

From arXiv; 34 pages, 8 figures; to appear in the 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

Escaping saddle points is a central research topic in nonconvex optimization. In this paper, we propose a simple gradient-based algorithm such that for a smooth function $f\colon\mathbb{R}^n\to\mathbb{R}$, it outputs an $\epsilon$-approximate second-order stationary point in $\tilde{O}(\log n/\epsilon^{1.75})$ iterations. Compared to the previous state-of-the-art algorithms by Jin et al. with $\tilde{O}((\log n)^{4}/\epsilon^{2})$ or $\tilde{O}((\log n)^{6}/\epsilon^{1.75})$ iterations, our algorithm is polynomially better in terms of $\log n$ and matches their complexities in terms of $1/\epsilon$. For the stochastic setting, our algorithm outputs an $\epsilon$-approximate second-order stationary point in $\tilde{O}((\log n)^{2}/\epsilon^{4})$ iterations. Technically, our main contribution is an idea of implementing a robust Hessian power method using only gradients, which can find negative curvature near saddle points and achieve the polynomial speedup in $\log n$ compared to the perturbed gradient descent methods. Finally, we also perform numerical experiments that support our results.
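The abstract's central idea, a Hessian power method that uses only gradients, can be illustrated with a small sketch. This is not the authors' algorithm: it is a plain power iteration on $I - H/\ell$ in which each Hessian-vector product is approximated by a finite difference of gradients, $Hv \approx (\nabla f(x + rv) - \nabla f(x))/r$. The toy function, the names `grad_f` and `negative_curvature_direction`, and the parameters `ell`, `r`, and `iters` are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy function (not from the paper):
# f(x) = 0.5 * x^T A x + 0.25 * sum(x_i^4), which has a saddle point at the origin.
A = np.diag([1.0, -0.5])

def grad_f(x):
    """Gradient of the toy function above."""
    return A @ x + x**3

def negative_curvature_direction(grad, x, ell, r=1e-3, iters=200, seed=0):
    """Power iteration on (I - H/ell) using gradient differences only.

    `ell` is assumed to upper-bound the largest Hessian eigenvalue near x,
    so the dominant eigenvector of (I - H/ell) is the direction of most
    negative curvature. The Hessian itself is never formed.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(x.shape)
    v /= np.linalg.norm(v)
    g0 = grad(x)
    for _ in range(iters):
        hv = (grad(x + r * v) - g0) / r  # finite-difference estimate of H v
        v = v - hv / ell                 # apply (I - H/ell) to v
        v /= np.linalg.norm(v)           # renormalize to keep the iterate bounded
    hv = (grad(x + r * v) - g0) / r
    return v, float(v @ hv)              # direction and estimated curvature v^T H v

v, curvature = negative_curvature_direction(grad_f, np.zeros(2), ell=2.0)
print(v, curvature)
```

On this toy problem the returned direction should align with the second coordinate axis and the estimated curvature should be negative, so a step along `v` (or `-v`) decreases the function and leaves the saddle.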



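In mathematics, a saddle point or minimax point is a point on the surface of the graph of a function where the slopes (derivatives) in orthogonal directions are all zero, but which is not a local extremum of the function. For example, a saddle point can be a critical point that is a relative minimum along one axial direction (between peaks) and a relative maximum along the crossing axis.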
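As a minimal worked example of this definition, consider the textbook function $f(x, y) = x^2 - y^2$, chosen here only for illustration. Its gradient vanishes at the origin, yet the Hessian has one positive and one negative eigenvalue, so the origin is a minimum along the $x$-axis and a maximum along the $y$-axis.

```python
import numpy as np

def f(p):
    """Textbook saddle: f(x, y) = x^2 - y^2."""
    x, y = p
    return x**2 - y**2

def grad(p):
    """Gradient of f, computed by hand."""
    x, y = p
    return np.array([2.0 * x, -2.0 * y])

# The Hessian of f is constant, so it can be written down directly.
hessian = np.array([[2.0, 0.0],
                    [0.0, -2.0]])

origin = np.zeros(2)
eigenvalues = np.linalg.eigvalsh(hessian)       # returned in ascending order
is_critical = np.allclose(grad(origin), 0.0)    # zero slope in every direction
# A critical point with eigenvalues of both signs is a saddle, not an extremum.
is_saddle = is_critical and eigenvalues[0] < 0.0 < eigenvalues[-1]
print(is_critical, eigenvalues, is_saddle)
```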

