We provide a series of results for unsupervised learning with autoencoders. Specifically, we study shallow two-layer autoencoder architectures with shared weights. We focus on three generative models for data that are common in statistical machine learning: (i) the mixture-of-Gaussians model, (ii) the sparse coding model, and (iii) the sparse coding model with non-negative coefficients. For each of these models, we prove that under suitable choices of hyperparameters, architectures, and initialization, autoencoders trained by gradient descent can successfully recover the parameters of the corresponding model. To our knowledge, this is the first result that rigorously studies the dynamics of gradient descent for weight-sharing autoencoders. Our analysis can be viewed as theoretical evidence that shallow autoencoder modules can indeed serve as feature learning mechanisms for a variety of data models, and may shed light on how to train larger stacked architectures with autoencoders as basic building blocks.
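As a concrete illustration of the weight-tied architecture described above, the following is a minimal sketch in NumPy of a two-layer autoencoder whose encoder and decoder share a single weight matrix, trained by plain gradient descent on the squared reconstruction loss. The ReLU activation, step size, dimensions, and mixture-of-Gaussians toy data are illustrative assumptions, not choices taken from the abstract; the gradient has two terms because the shared weights appear in both the encoder and the decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: n-dimensional data, m hidden units.
n, m = 20, 10
W = rng.normal(scale=1.0 / np.sqrt(n), size=(m, n))  # shared encoder/decoder weights
b = np.zeros(m)
lr = 0.05

def relu(z):
    return np.maximum(z, 0.0)

def step(x, W, b, lr):
    """One gradient-descent step on L = 0.5 * ||W^T relu(Wx + b) - x||^2
    for a weight-tied two-layer autoencoder (assumed ReLU activation)."""
    pre = W @ x + b
    h = relu(pre)
    x_hat = W.T @ h
    r = x_hat - x                              # reconstruction residual
    g = (W @ r) * (pre > 0)                    # backprop through encoder ReLU
    grad_W = np.outer(h, r) + np.outer(g, x)   # decoder path + encoder path (shared W)
    grad_b = g
    loss = 0.5 * np.dot(r, r)
    return W - lr * grad_W, b - lr * grad_b, loss

# Toy data from a mixture of two Gaussians with random means (assumption).
means = rng.normal(size=(2, n))
for t in range(2000):
    k = rng.integers(2)
    x = means[k] + 0.1 * rng.normal(size=n)
    W, b, loss = step(x, W, b, lr)
```

This sketch only demonstrates the architecture and update rule; the paper's results concern when such gradient dynamics provably recover the generative model's parameters under suitable initialization.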