This paper studies the infinite-width limit of deep linear neural networks initialized with random parameters. We show that, as the number of neurons tends to infinity, the training dynamics converge (in a precise sense) to the dynamics of gradient descent on an infinitely wide deterministic linear neural network. Moreover, even though the weights remain random, we characterize their precise law along the training dynamics and prove a quantitative convergence result for the linear predictor in terms of the number of neurons. We finally study the continuous-time limit obtained for infinitely wide linear neural networks and show that the linear predictors of the neural network converge at an exponential rate to the minimal $\ell_2$-norm minimizer of the risk.
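To make the setting concrete, the following is a minimal numerical sketch (not the paper's exact construction): a depth-$L$ linear network with wide random initialization is trained by gradient descent on a squared loss, and the end-to-end linear predictor $W_L \cdots W_1$ is compared with the minimal $\ell_2$-norm minimizer of the empirical risk, obtained via the Moore-Penrose pseudoinverse. The width `m`, depth `L`, $1/\sqrt{\text{width}}$ initialization scaling, learning rate, and synthetic data are illustrative assumptions, not the paper's precise hypotheses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not the paper's setting):
# d inputs, one output, m hidden neurons, L layers, n < d samples
# so that the risk has many minimizers and the minimal-norm one is distinguished.
d, m, L, n = 20, 500, 3, 10
lr, steps = 0.01, 3000

# Synthetic regression data; the minimal l2-norm risk minimizer
# is the pseudoinverse solution.
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d)
beta_star = np.linalg.pinv(X) @ y

# Random initialization with 1/sqrt(fan-in) scaling (an assumption;
# the paper's initialization law may differ).
dims = [d] + [m] * (L - 1) + [1]
W = [rng.standard_normal((dims[i + 1], dims[i])) / np.sqrt(dims[i])
     for i in range(L)]

def predictor(W):
    """End-to-end linear map W_L ... W_1 as a vector of size d."""
    P = W[0]
    for Wk in W[1:]:
        P = Wk @ P
    return P.ravel()

for t in range(steps):
    # Forward pass: store intermediate activations for the chain rule.
    acts = [X.T]
    for Wk in W:
        acts.append(Wk @ acts[-1])
    residual = acts[-1].ravel() - y          # shape (n,)
    G = residual[None, :] / n                # gradient w.r.t. the output, shape (1, n)

    # Backward pass through the purely linear layers.
    grads = []
    for k in reversed(range(L)):
        grads.append(G @ acts[k].T)          # gradient w.r.t. W[k]
        G = W[k].T @ G
    grads.reverse()

    for k in range(L):
        W[k] -= lr * grads[k]

gap = np.linalg.norm(predictor(W) - beta_star)
print(f"distance of the linear predictor to the minimal-norm minimizer: {gap:.2e}")
```

Under the abstract's result, one expects this gap to become small as the width grows and training time increases; for finite `m` and finitely many steps the sketch only approximates that limiting behaviour.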