A Principle of Least Action for the Training of Neural Networks

Neural networks achieve high generalization performance on many tasks despite being highly over-parameterized. Since classical statistical learning theory struggles to explain this behavior, much recent effort has focused on uncovering the mechanisms behind it, in the hope of developing a more adequate theoretical framework and gaining better control over the trained models. In this work, we adopt an alternative perspective, viewing the neural network as a dynamical system displacing input particles over time. We conduct a series of experiments and, by analyzing the network's behavior through its displacements, we show the presence of a low kinetic energy displacement bias in the transport map of the network, and link this bias with generalization performance. From this observation, we reformulate the learning problem as follows: find neural networks which solve the task while transporting the data as efficiently as possible. This novel formulation allows us to provide regularity results for the solution network, based on Optimal Transport theory. From a practical viewpoint, it allows us to propose a new learning algorithm, which automatically adapts to the complexity of the given task and leads to networks with high generalization ability even in low data regimes.
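The abstract does not state the exact objective, so the following is only a minimal sketch of the idea of "solving the task while transporting the data as efficiently as possible". It assumes a residual network read as a discrete-time flow x_{k+1} = x_k + v_k(x_k), and it assumes the kinetic energy is measured as the mean over points of the summed squared displacements. The names `residual_block`, `transport`, `regularized_loss`, and the weight `lam` are hypothetical and are not taken from the paper.

```python
import math


def residual_block(x, weight, bias):
    """Displacement v = tanh(W x + b) applied to one point x (list of floats)."""
    return [
        math.tanh(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
        for row, b_i in zip(weight, bias)
    ]


def transport(points, blocks):
    """Push points through x_{k+1} = x_k + v_k(x_k).

    Returns the transported points and the kinetic energy, taken here
    (by assumption) as the per-point average of sum_k ||v_k(x_k)||^2.
    """
    energy = 0.0
    for weight, bias in blocks:
        moved = []
        for x in points:
            v = residual_block(x, weight, bias)
            energy += sum(v_i * v_i for v_i in v)
            moved.append([x_i + v_i for x_i, v_i in zip(x, v)])
        points = moved
    return points, energy / len(points)


def regularized_loss(task_loss, energy, lam=0.1):
    """Task loss plus a transport-cost penalty (lam is an assumed trade-off weight)."""
    return task_loss + lam * energy
```

Under these assumptions, a network whose blocks barely move the inputs has near-zero kinetic energy, so the penalty favors, among the networks that fit the data, those that displace it least.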
