神经网络作为核心学习者:静静对齐效应 (Neural Networks as Kernel Learners: The Silent Alignment Effect) - 专知论文

会员服务 ·

0

核化 · Networking · Neural Networks · 核回归 · 学成 ·

2021 年 12 月 2 日

Neural Networks as Kernel Learners: The Silent Alignment Effect

翻译：神经网络作为核心学习者:静静对齐效应

Alexander Atanasov,Blake Bordelon,Cengiz Pehlevan

from arxiv, 29 pages, 15 figures. Added additional experiments and expanded the derivations in the appendix

Neural networks in the lazy training regime converge to kernel machines. Can neural networks in the rich feature learning regime learn a kernel machine with a data-dependent kernel? We demonstrate that this can indeed happen due to a phenomenon we term silent alignment, which requires that the tangent kernel of a network evolves in eigenstructure while small and before the loss appreciably decreases, and grows only in overall scale afterwards. We show that such an effect takes place in homogenous neural networks with small initialization and whitened data. We provide an analytical treatment of this effect in the linear network case. In general, we find that the kernel develops a low-rank contribution in the early phase of training, and then evolves in overall scale, yielding a function equivalent to a kernel regression solution with the final network's tangent kernel. The early spectral learning of the kernel depends on the depth. We also demonstrate that non-whitened data can weaken the silent alignment effect.

翻译：懒惰训练制度中的神经网络与内核相融合。丰富特性学习制度中的神经网络能否用数据依赖内核来学习内核机? 我们证明,这之所以能够发生,是因为我们使用静静的对齐,这要求一个网络的相近内核在机体结构中演化,而小的和在损失明显减少之前,并且只是在整体规模上发展。我们显示,这种效应发生在具有小初始化和白化数据的同质神经网络中。我们在线性网络案例中对这种效应进行了分析处理。一般来说,我们发现内核在训练的早期阶段发展了低级贡献,然后在总体规模上演化,产生相当于最后网络的正热内核内核内核的内核回归溶液的功能。内核早期光学取决于深度。我们还表明,非白色数据可以削弱静态对接合效应。

0

相关内容

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

【ETH】最新《几何数据分析》2020课程，附PPT下载

专知会员服务

44+阅读 · 2020年12月18日

一份简单《图神经网络》教程，28页ppt

一份简单《图神经网络》教程，28页ppt

专知会员服务

126+阅读 · 2020年8月2日

【ICML2020】用于图结构化数据的卷积核网络，Convolutional Kernel Networks for Graph-Structured Data

【ICML2020】用于图结构化数据的卷积核网络，Convolutional Kernel Networks for Graph-Structured Data

专知会员服务

44+阅读 · 2020年6月29日

【清华大学】图随机神经网络，Graph Random Neural Networks

【清华大学】图随机神经网络，Graph Random Neural Networks

专知会员服务

156+阅读 · 2020年5月26日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

250+阅读 · 2020年4月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

最新《几何深度学习》教程，100页ppt，Geometric Deep Learning

最新《几何深度学习》教程，100页ppt，Geometric Deep Learning

专知

12+阅读 · 2020年7月16日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

已删除

将门创投

4+阅读 · 2018年11月15日

利用动态深度学习预测金融时间序列基于Python

利用动态深度学习预测金融时间序列基于Python

量化投资与机器学习

18+阅读 · 2018年10月30日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Failure and success of the spectral bias prediction for Kernel Ridge Regression: the case of low-dimensional data

Arxiv

0+阅读 · 2022年2月7日

Spectral Bias and Task-Model Alignment Explain Generalization in Kernel Regression and Infinitely Wide Neural Networks

Arxiv

0+阅读 · 2022年2月4日

Out-of-Distribution Generalization in Kernel Regression

Arxiv

0+阅读 · 2022年2月4日

Learning strides in convolutional neural networks

Arxiv

0+阅读 · 2022年2月3日

Learning Theory Can (Sometimes) Explain Generalisation in Graph Neural Networks

Arxiv

4+阅读 · 2021年12月7日

Molecular graph generation with Graph Neural Networks

Arxiv

3+阅读 · 2021年5月27日

Neural Architecture Generator Optimization

Arxiv

6+阅读 · 2020年10月8日

DAG-GNN: DAG Structure Learning with Graph Neural Networks

Arxiv

8+阅读 · 2019年4月22日

A Dual Approach to Scalable Verification of Deep Networks

A Dual Approach to Scalable Verification of Deep Networks

Arxiv

3+阅读 · 2018年8月3日

Convolutional Sequence to Sequence Learning

Arxiv

4+阅读 · 2017年7月25日

VIP会员

文章信息

相关主题

Neural Networks

相关VIP内容

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

【ETH】最新《几何数据分析》2020课程，附PPT下载

专知会员服务

44+阅读 · 2020年12月18日

一份简单《图神经网络》教程，28页ppt

一份简单《图神经网络》教程，28页ppt

专知会员服务

126+阅读 · 2020年8月2日

【ICML2020】用于图结构化数据的卷积核网络，Convolutional Kernel Networks for Graph-Structured Data

【ICML2020】用于图结构化数据的卷积核网络，Convolutional Kernel Networks for Graph-Structured Data

专知会员服务

44+阅读 · 2020年6月29日

【清华大学】图随机神经网络，Graph Random Neural Networks

【清华大学】图随机神经网络，Graph Random Neural Networks

专知会员服务

156+阅读 · 2020年5月26日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

250+阅读 · 2020年4月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

《步兵小单元山地严寒作战指南》美军最新条令200页

《联合作战概念的发展》最新报告

俄制无人机弹药

《复杂场景下自主着陆的模型预测控制技术》92页

相关资讯

最新《几何深度学习》教程，100页ppt，Geometric Deep Learning

最新《几何深度学习》教程，100页ppt，Geometric Deep Learning

专知

12+阅读 · 2020年7月16日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

已删除

将门创投

4+阅读 · 2018年11月15日

利用动态深度学习预测金融时间序列基于Python

利用动态深度学习预测金融时间序列基于Python

量化投资与机器学习

18+阅读 · 2018年10月30日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Failure and success of the spectral bias prediction for Kernel Ridge Regression: the case of low-dimensional data

Arxiv

0+阅读 · 2022年2月7日

Spectral Bias and Task-Model Alignment Explain Generalization in Kernel Regression and Infinitely Wide Neural Networks

Arxiv

0+阅读 · 2022年2月4日

Out-of-Distribution Generalization in Kernel Regression

Arxiv

0+阅读 · 2022年2月4日

Learning strides in convolutional neural networks

Arxiv

0+阅读 · 2022年2月3日

Learning Theory Can (Sometimes) Explain Generalisation in Graph Neural Networks

Arxiv

4+阅读 · 2021年12月7日

Molecular graph generation with Graph Neural Networks

Arxiv

3+阅读 · 2021年5月27日

Neural Architecture Generator Optimization

Arxiv

6+阅读 · 2020年10月8日

DAG-GNN: DAG Structure Learning with Graph Neural Networks

Arxiv

8+阅读 · 2019年4月22日

A Dual Approach to Scalable Verification of Deep Networks

A Dual Approach to Scalable Verification of Deep Networks

Arxiv

3+阅读 · 2018年8月3日

Convolutional Sequence to Sequence Learning

Arxiv

4+阅读 · 2017年7月25日

微信扫码咨询专知VIP会员