A Continuized View on Nesterov Acceleration

We introduce the "continuized" Nesterov acceleration, a close variant of Nesterov acceleration whose variables are indexed by a continuous time parameter. The two variables continuously mix following a linear ordinary differential equation and take gradient steps at random times. This continuized variant benefits from the best of the continuous and the discrete frameworks: as a continuous process, one can use differential calculus to analyze convergence and obtain analytical expressions for the parameters; at the same time, a discretization of the continuized process can be computed exactly, with convergence rates similar to those of Nesterov's original acceleration. We show that the discretization has the same structure as Nesterov acceleration, but with random parameters.
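The mechanism described above (two variables mixing under a linear ODE, with gradient steps at random times) can be sketched in a few lines. This is a minimal illustration and not the authors' reference implementation. It assumes a mu-strongly convex, L-smooth objective, gradient times drawn from a rate-1 Poisson process, and the constants `eta = sqrt(mu/L)`, `gamma = 1/L`, `gamma_z = 1/sqrt(mu*L)`; none of these are stated in the abstract itself. The function name `continuized_nesterov` and the test quadratic are hypothetical. Between two gradient times the mixing ODE is linear, so it is solved in closed form rather than approximated.

```python
import math
import random

# Hypothetical test problem: f(v) = 0.5 * sum(lam_i * v_i^2), with mu = 1, L = 10.
EIGS = [1.0, 4.0, 10.0]

def f(v):
    return 0.5 * sum(lam * vi * vi for lam, vi in zip(EIGS, v))

def grad_f(v):
    return [lam * vi for lam, vi in zip(EIGS, v)]

def continuized_nesterov(grad, x0, mu, L, n_jumps, seed=0):
    """Exact simulation of a continuized two-variable scheme (sketch).

    Assumed constants (strongly convex setting, not taken from the abstract):
    mixing rate eta, step 1/L on x, step 1/sqrt(mu*L) on z.
    """
    rng = random.Random(seed)
    eta = math.sqrt(mu / L)
    gamma = 1.0 / L
    gamma_z = 1.0 / math.sqrt(mu * L)
    x, z = list(x0), list(x0)
    for _ in range(n_jumps):
        # Waiting time until the next gradient step: Exp(1) inter-arrival.
        tau = rng.expovariate(1.0)
        # Closed-form solution of dx = eta (z - x) dt, dz = eta (x - z) dt
        # over a duration tau: x + z is conserved, x - z decays at rate 2 * eta.
        w = 0.5 * (1.0 - math.exp(-2.0 * eta * tau))
        x, z = (
            [xi + w * (zi - xi) for xi, zi in zip(x, z)],
            [zi + w * (xi - zi) for xi, zi in zip(x, z)],
        )
        # Both variables take a gradient step, with the gradient evaluated at x.
        g = grad(x)
        x = [xi - gamma * gi for xi, gi in zip(x, g)]
        z = [zi - gamma_z * gi for zi, gi in zip(z, g)]
    return x

x_final = continuized_nesterov(grad_f, [1.0, 1.0, 1.0], mu=1.0, L=10.0, n_jumps=300)
print(f(x_final))
```

Viewed at the jump times only, each iteration is a convex mixing of `x` and `z` followed by gradient steps, which is the structure of Nesterov acceleration with a mixing weight `w` that depends on the random waiting time `tau`.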

