Wide neural networks with random weights and biases are Gaussian processes, as observed by Neal (1995) for shallow networks, and more recently by Lee et al. (2018) and Matthews et al. (2018) for deep fully-connected networks, as well as by Novak et al. (2019) and Garriga-Alonso et al. (2019) for deep convolutional networks. We show that this Neural Network-Gaussian Process correspondence surprisingly extends to all modern feedforward or recurrent neural networks composed of multilayer perceptrons, RNNs (e.g. LSTMs, GRUs), (nD or graph) convolutions, pooling, skip connections, attention, batch normalization, and/or layer normalization. More generally, we introduce a language for expressing neural network computations, and our result encompasses all neural networks expressible in it. This work serves as a tutorial on the *tensor programs* technique formulated in Yang (2019) and elucidates the Gaussian Process results obtained there. We provide open-source implementations of the Gaussian Process kernels of the simple RNN, GRU, transformer, and batchnorm+ReLU networks at github.com/thegregyang/GP4A.
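To make the correspondence concrete, here is a minimal sketch (not code from the paper or the GP4A repo) for the simplest case: a one-hidden-layer ReLU network with N(0, 1/fan_in) weights and no biases. In the infinite-width limit, the output covariance is given in closed form by the order-1 arc-cosine kernel (Cho & Saul, 2009); the snippet below compares that closed form against a Monte Carlo estimate over random finite-width networks. All function names are illustrative.

```python
import numpy as np

def relu_nngp_kernel(x, y):
    """Closed-form GP kernel of an infinitely wide one-hidden-layer ReLU
    network with N(0, 1/fan_in) weights (order-1 arc-cosine kernel)."""
    d = len(x)
    kxx, kyy, kxy = x @ x / d, y @ y / d, x @ y / d
    theta = np.arccos(np.clip(kxy / np.sqrt(kxx * kyy), -1.0, 1.0))
    return np.sqrt(kxx * kyy) * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)

def finite_net_cov(x, y, width=4096, samples=2000, seed=0):
    """Empirical covariance of the scalar outputs f(x), f(y) over
    independent draws of the random weights."""
    rng = np.random.default_rng(seed)
    d = len(x)
    outs = []
    for _ in range(samples):
        W = rng.normal(0.0, np.sqrt(1.0 / d), size=(width, d))    # hidden layer
        v = rng.normal(0.0, np.sqrt(1.0 / width), size=width)     # readout layer
        outs.append((v @ np.maximum(W @ x, 0), v @ np.maximum(W @ y, 0)))
    outs = np.array(outs)
    return np.mean(outs[:, 0] * outs[:, 1])

x, y = np.array([1.0, 0.0, 1.0]), np.array([0.5, 1.0, -0.5])
print("closed form :", relu_nngp_kernel(x, y))   # infinite-width kernel
print("Monte Carlo :", finite_net_cov(x, y))     # finite-width estimate
```

The two printed numbers should agree to within Monte Carlo error, and the gap shrinks as `width` and `samples` grow; the paper's contribution is extending this kind of kernel computation from this toy case to arbitrary architectures expressible as tensor programs.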