Scaling Forward Gradient With Local Losses - 专知 (Zhuanzhi) Papers


Forward · Learning · Loss · Variance · Scaling

February 17, 2023

Scaling Forward Gradient With Local Losses

Mengye Ren,Simon Kornblith,Renjie Liao,Geoffrey Hinton

from arXiv, 31 pages, ICLR 2023

Forward gradient learning computes a noisy directional gradient and is a biologically plausible alternative to backprop for learning deep neural networks. However, the standard forward gradient algorithm, when applied naively, suffers from high variance when the number of parameters to be learned is large. In this paper, we propose a series of architectural and algorithmic modifications that together make forward gradient learning practical for standard deep learning benchmark tasks. We show that it is possible to substantially reduce the variance of the forward gradient estimator by applying perturbations to activations rather than weights. We further improve the scalability of forward gradient by introducing a large number of local greedy loss functions, each of which involves only a small number of learnable parameters, and a new MLPMixer-inspired architecture, LocalMixer, that is more suitable for local learning. Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
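To make the "noisy directional gradient" in the abstract concrete, here is a minimal sketch of a weight-perturbed forward gradient on a toy linear model. It is an illustration under stated assumptions, not the authors' implementation: the `Dual` class, `loss`, `forward_gradient`, and the toy data are all hypothetical names and values chosen for this example. The estimator samples a random direction v, obtains the directional derivative of the loss along v in a single forward pass (here with dual numbers), and scales v by it. Averaged over many directions it approaches the true gradient, but each single estimate is noisy, which is the variance problem the abstract describes.

```python
import random


class Dual:
    """A value paired with its derivative along one direction (a tangent)."""

    def __init__(self, val, tan=0.0):
        self.val, self.tan = val, tan

    @staticmethod
    def _lift(other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual._lift(other)
        return Dual(self.val - other.val, self.tan - other.tan)

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule for the tangent part.
        return Dual(self.val * other.val,
                    self.val * other.tan + self.tan * other.val)

    __rmul__ = __mul__


def loss(w, x, y):
    """Squared error of a linear model; accepts floats or Dual weights."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return err * err


def true_gradient(w, x, y):
    """Analytic gradient of the toy loss, used only for comparison."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [2.0 * err * xi for xi in x]


def forward_gradient(w, x, y, rng):
    """One forward-gradient sample: (directional derivative along v) * v."""
    v = [rng.gauss(0.0, 1.0) for _ in w]
    out = loss([Dual(wi, vi) for wi, vi in zip(w, v)], x, y)
    return [out.tan * vi for vi in v]


def average_forward_gradient(w, x, y, n_samples, seed=0):
    """Average many noisy samples to show the estimator is unbiased."""
    rng = random.Random(seed)
    total = [0.0] * len(w)
    for _ in range(n_samples):
        g = forward_gradient(w, x, y, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / n_samples for t in total]


# Toy data (assumed values for illustration only).
w, x, y = [0.5, -1.0, 2.0], [1.0, 2.0, 3.0], 4.0
print("true gradient:   ", true_gradient(w, x, y))
print("1 sample:        ", forward_gradient(w, x, y, random.Random(1)))
print("20000-sample avg:", average_forward_gradient(w, x, y, 20000))
```

The single-sample estimate can be far from the true gradient, while the average converges to it. The per-sample noise grows with the number of perturbed dimensions, which motivates the abstract's proposal to perturb activations instead of weights and to use many local losses that each involve few parameters.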
