Analyzing the dynamics of neural networks trained with stochastic gradient descent (SGD) is crucial for building theoretical foundations for deep learning. Previous work has analyzed structured inputs within the \textit{hidden manifold model}, often under the simplifying assumption of a Gaussian distribution. We extend this framework by modeling inputs as Gaussian mixtures to better represent complex, real-world data. Through empirical and theoretical investigation, we demonstrate that, with proper standardization, the learning dynamics converge to the behavior seen in the simple Gaussian case. This finding reflects a form of universality, whereby diverse structured distributions yield results consistent with Gaussian assumptions, thereby strengthening the theoretical understanding of deep learning models.
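As a schematic illustration of the setup described above (the notation $K$, $\pi_k$, $\mu_k$, $\Sigma_k$, $C$, and $f$ below is ours, introduced only for this sketch): the latent vector is drawn from a Gaussian mixture, mapped through a fixed feature matrix and a pointwise nonlinearity as in the hidden manifold construction, and the resulting inputs are standardized componentwise before training.
\begin{align}
  z &\sim \sum_{k=1}^{K} \pi_k \,\mathcal{N}(\mu_k, \Sigma_k)
    && \text{(Gaussian-mixture latent variable, } z \in \mathbb{R}^{D}\text{)}\\
  x &= f\!\left(\tfrac{1}{\sqrt{D}}\, C z\right)
    && \text{(hidden manifold input, } C \in \mathbb{R}^{N \times D}\text{, } f \text{ applied componentwise)}\\
  \tilde{x}_i &= \frac{x_i - \mathbb{E}[x_i]}{\sqrt{\operatorname{Var}(x_i)}}
    && \text{(componentwise standardization)}
\end{align}
Under this kind of standardization, the claim is that the SGD learning curves obtained from the mixture-generated inputs $\tilde{x}$ match those obtained when the latent variable is a single standard Gaussian.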