Dropout is commonly used to help reduce overfitting in deep neural networks. Sparsity is a potentially important property of neural networks, but it is not explicitly controlled by Dropout-based regularization. In this work, we propose Sparseout, a simple and efficient variant of Dropout that can be used to control the sparsity of the activations in a neural network. We theoretically prove that Sparseout is equivalent to an $L_q$ penalty on the features of a generalized linear model, and that Dropout is a special case of Sparseout for neural networks. We empirically demonstrate that Sparseout is computationally inexpensive and is able to control the desired level of sparsity in the activations. We evaluated Sparseout on image classification and language modelling tasks to assess the effect of activation sparsity on performance. We found that sparser activations are favorable for language modelling, while image classification benefits from denser activations. Sparseout provides a way to investigate sparsity in state-of-the-art deep learning models. Source code for Sparseout can be found at \url{https://github.com/najeebkhan/sparseout}.
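As a rough illustration of the penalty referred to in the abstract, the sketch below computes an $L_q$ penalty on a layer's activations; the function name, the regularization weight \texttt{lam}, and the NumPy formulation are illustrative assumptions, not the authors' implementation (which is available at the repository linked above).

\begin{verbatim}
import numpy as np

# Illustrative sketch (not the authors' code): the L_q penalty on a layer's
# activations that, per the abstract, Sparseout is shown to be equivalent to
# for a generalized linear model.
def lq_penalty(activations, q=1.0, lam=1e-3):
    """Return lam * sum_i |a_i|^q over the given activations."""
    return lam * np.sum(np.abs(activations) ** q)

# Example usage: smaller q encourages sparser activations, larger q denser
# ones; q = 2 gives an ordinary squared (L2-style) penalty on activations.
a = np.array([0.8, -0.1, 0.0, 1.5])
print(lq_penalty(a, q=1.0))
print(lq_penalty(a, q=3.0))
\end{verbatim}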