Deep neural networks (DNNs) have achieved remarkable success in a variety of real-world applications, e.g., image classification. However, the enormous number of parameters in these networks limits their efficiency due to the large model size and intensive computation. To address this issue, various approximation techniques have been investigated, which seek a lightweight network that trades a smaller model size or faster inference for little performance degradation. Both low-rankness and sparsity are appealing properties for network approximation. In this paper we propose a unified framework that compresses convolutional neural networks (CNNs) by combining these two properties while taking the nonlinear activation into consideration. Each layer in the network is approximated by the sum of a structured sparse component and a low-rank component, which is formulated as an optimization problem. An extended version of the alternating direction method of multipliers (ADMM) with guaranteed convergence is then presented to solve the relaxed optimization problem. Experiments are carried out on VGG-16, AlexNet and GoogLeNet with large image classification datasets. The results outperform previous work in terms of accuracy degradation, compression rate and speedup ratio. The proposed method remarkably compresses the model (with up to a 4.9x reduction in parameters) at the cost of little or no loss in accuracy.
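To make the "sum of a structured sparse component and a low-rank component" idea concrete, the sketch below shows a minimal alternating heuristic for decomposing a single weight matrix as W ≈ L + S, using truncated SVD for the low-rank part and hard thresholding for the sparse part. This is only an illustrative toy, not the paper's method: the actual framework uses an extended ADMM, accounts for the nonlinear activation, and imposes structure on the sparse component; the function name `lowrank_plus_sparse` and its parameters `rank`, `sparsity`, and `n_iter` are hypothetical.

```python
import numpy as np

def lowrank_plus_sparse(W, rank, sparsity, n_iter=50):
    """Approximate W ~= L + S, with L low-rank and S sparse.

    Toy alternating heuristic (not the paper's ADMM solver):
    truncated SVD for L, hard thresholding for S.
    """
    S = np.zeros_like(W)
    k = int(sparsity * W.size)  # number of nonzeros to keep in S
    for _ in range(n_iter):
        # Low-rank step: best rank-r approximation of the residual W - S.
        U, s, Vt = np.linalg.svd(W - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Sparse step: keep the k largest-magnitude entries of W - L.
        R = W - L
        if k > 0:
            thresh = np.partition(np.abs(R).ravel(), -k)[-k]
            S = np.where(np.abs(R) >= thresh, R, 0.0)
        else:
            S = np.zeros_like(W)
    return L, S

# Toy usage: decompose a random "layer" weight matrix.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128))
L, S = lowrank_plus_sparse(W, rank=8, sparsity=0.05)
print("relative error:", np.linalg.norm(W - L - S) / np.linalg.norm(W))
```

The compression benefit comes from storage and compute: a rank-r factorization of an m-by-n matrix costs r(m+n) parameters instead of mn, and the sparse component adds only its few nonzero entries, which is where the reported parameter reduction comes from.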