存储比重- 递减梯级法 (Stochastic Bias-Reduced Gradient Methods) - 专知论文

会员服务 ·

0

估计/估计量 · Lipschitz · 优化器 · 泛函 · Performer ·

2021 年 6 月 17 日

Stochastic Bias-Reduced Gradient Methods

翻译：存储比重- 递减梯级法

Hilal Asi,Yair Carmon,Arun Jambulapati,Yujia Jin,Aaron Sidford

We develop a new primitive for stochastic optimization: a low-bias, low-cost estimator of the minimizer $x_\star$ of any Lipschitz strongly-convex function. In particular, we use a multilevel Monte-Carlo approach due to Blanchet and Glynn to turn any optimal stochastic gradient method into an estimator of $x_\star$ with bias $\delta$, variance $O(\log(1/\delta))$, and an expected sampling cost of $O(\log(1/\delta))$ stochastic gradient evaluations. As an immediate consequence, we obtain cheap and nearly unbiased gradient estimators for the Moreau-Yoshida envelope of any Lipschitz convex function, allowing us to perform dimension-free randomized smoothing. We demonstrate the potential of our estimator through four applications. First, we develop a method for minimizing the maximum of $N$ functions, improving on recent results and matching a lower bound up logarithmic factors. Second and third, we recover state-of-the-art rates for projection-efficient and gradient-efficient optimization using simple algorithms with a transparent analysis. Finally, we show that an improved version of our estimator would yield a nearly linear-time, optimal-utility, differentially-private non-smooth stochastic optimization method.

翻译：我们开发了一个用于随机优化的新型原始方法:一个低比值,低成本估算任何Lipschitz 强力混凝土功能的最小化美元(xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/Glynn) 。特别是,我们利用Blanchchet和Glynn,利用Blanchchetx函数的Moreau-Yoshida封获得廉价和近乎公正的梯度梯度测算器,将任何最佳的透析梯度方法转换成一个偏差为$xxxxxxxxx(1/delta)美元、差价O(O(log))$(oxoxoxoxoxoxoxoxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

0

相关内容

估计/估计量

估计/估计量

【ICML2021】异质风险最小化，Heterogeneous Risk Minimization

专知会员服务

16+阅读 · 2021年5月21日

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

【经典书】应用随机微分方程，324页pdf，Applied Stochastic Differential Equations

【经典书】应用随机微分方程，324页pdf，Applied Stochastic Differential Equations

专知会员服务

57+阅读 · 2020年11月21日

【Aalto博士论文】高效样本近似贝叶斯计算的高斯过程代理方法，84页pdf

专知会员服务

35+阅读 · 2020年9月30日

【论文推荐】Stochastic Graph Neural Networks，随机图神经网络

【论文推荐】Stochastic Graph Neural Networks，随机图神经网络

专知会员服务

69+阅读 · 2020年6月6日

【伯克利】元学习的元基线，A New Meta-Baseline for Few-Shot Learning

【伯克利】元学习的元基线，A New Meta-Baseline for Few-Shot Learning

专知会员服务

67+阅读 · 2020年3月28日

【AAAI2020论文】隐私保留GBDT（Privacy-Preserving Gradient Boosting Decision Trees）

【AAAI2020论文】隐私保留GBDT（Privacy-Preserving Gradient Boosting Decision Trees）

专知会员服务

36+阅读 · 2019年11月15日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

OpenAI丨深度强化学习关键论文列表

OpenAI丨深度强化学习关键论文列表

中国人工智能学会

17+阅读 · 2018年11月10日

【OpenAI】深度强化学习关键论文列表

【OpenAI】深度强化学习关键论文列表

专知

11+阅读 · 2018年11月10日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Stochastic Cluster Embedding

Arxiv

0+阅读 · 2021年8月18日

Stochastic loss reserving with mixture density neural networks

Arxiv

0+阅读 · 2021年8月18日

Risk Bounds for Quantile Trend Filtering

Arxiv

0+阅读 · 2021年8月17日

Non-Asymptotic Bounds for the $\ell_{\infty}$ Estimator in Linear Regression with Uniform Noise

Arxiv

0+阅读 · 2021年8月17日

Stochastic optimization under time drift: iterate averaging, step decay, and high probability guarantees

Arxiv

0+阅读 · 2021年8月16日

Adaptive Gradient Descent Methods for Computing Implied Volatility

Arxiv

0+阅读 · 2021年8月16日

Information-Theoretic Generalization Bounds for Stochastic Gradient Descent

Arxiv

0+阅读 · 2021年8月15日

Efficient Byzantine-Resilient Stochastic Gradient Desce

Arxiv

0+阅读 · 2021年8月15日

Adaptive optimal estimation of irregular mean and covariance functions

Arxiv

0+阅读 · 2021年8月14日

Neural Ordinary Differential Equations

Arxiv

6+阅读 · 2018年10月3日

VIP会员

文章信息

相关主题

估计/估计量

相关VIP内容

【ICML2021】异质风险最小化，Heterogeneous Risk Minimization

专知会员服务

16+阅读 · 2021年5月21日

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

【经典书】应用随机微分方程，324页pdf，Applied Stochastic Differential Equations

【经典书】应用随机微分方程，324页pdf，Applied Stochastic Differential Equations

专知会员服务

57+阅读 · 2020年11月21日

【Aalto博士论文】高效样本近似贝叶斯计算的高斯过程代理方法，84页pdf

专知会员服务

35+阅读 · 2020年9月30日

【论文推荐】Stochastic Graph Neural Networks，随机图神经网络

【论文推荐】Stochastic Graph Neural Networks，随机图神经网络

专知会员服务

69+阅读 · 2020年6月6日

【伯克利】元学习的元基线，A New Meta-Baseline for Few-Shot Learning

【伯克利】元学习的元基线，A New Meta-Baseline for Few-Shot Learning

专知会员服务

67+阅读 · 2020年3月28日

【AAAI2020论文】隐私保留GBDT（Privacy-Preserving Gradient Boosting Decision Trees）

【AAAI2020论文】隐私保留GBDT（Privacy-Preserving Gradient Boosting Decision Trees）

专知会员服务

36+阅读 · 2019年11月15日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

人工智能治理的未来

模态感知的特征匹配：单一模态与跨模态技术的全面综述

无监督行人重识别研究综述

【牛津博士论文】面向神经影像应用的可扩展且可解释的空间模型

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

OpenAI丨深度强化学习关键论文列表

OpenAI丨深度强化学习关键论文列表

中国人工智能学会

17+阅读 · 2018年11月10日

【OpenAI】深度强化学习关键论文列表

【OpenAI】深度强化学习关键论文列表

专知

11+阅读 · 2018年11月10日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

相关论文

Stochastic Cluster Embedding

Arxiv

0+阅读 · 2021年8月18日

Stochastic loss reserving with mixture density neural networks

Arxiv

0+阅读 · 2021年8月18日

Risk Bounds for Quantile Trend Filtering

Arxiv

0+阅读 · 2021年8月17日

Non-Asymptotic Bounds for the $\ell_{\infty}$ Estimator in Linear Regression with Uniform Noise

Arxiv

0+阅读 · 2021年8月17日

Stochastic optimization under time drift: iterate averaging, step decay, and high probability guarantees

Arxiv

0+阅读 · 2021年8月16日

Adaptive Gradient Descent Methods for Computing Implied Volatility

Arxiv

0+阅读 · 2021年8月16日

Information-Theoretic Generalization Bounds for Stochastic Gradient Descent

Arxiv

0+阅读 · 2021年8月15日

Efficient Byzantine-Resilient Stochastic Gradient Desce

Arxiv

0+阅读 · 2021年8月15日

Adaptive optimal estimation of irregular mean and covariance functions

Arxiv

0+阅读 · 2021年8月14日

Neural Ordinary Differential Equations

Arxiv

6+阅读 · 2018年10月3日

微信扫码咨询专知VIP会员