Deep neural networks (DNNs) have demonstrated their great potential in recent years, exceeding the performance of human experts in a wide range of applications. Due to their large sizes, however, compression techniques such as weight quantization and pruning are usually applied before they can be accommodated on the edge. It is generally believed that quantization leads to performance degradation, and plenty of existing works have explored quantization strategies aiming at minimum accuracy loss. In this paper, we argue that quantization, which essentially imposes regularization on weight representations, can sometimes help to improve accuracy. We conduct comprehensive experiments on three widely used applications: fully connected network (FCN) for biomedical image segmentation, convolutional neural network (CNN) for image classification on ImageNet, and recurrent neural network (RNN) for automatic speech recognition. Experimental results show that quantization can improve accuracy by 1%, 1.95%, and 4.23% on the three applications, respectively, with 3.5x-6.4x memory reduction.
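The abstract does not specify the quantization scheme, so the following is only a minimal illustrative sketch of uniform k-bit weight quantization, showing how quantization constrains weights to a small discrete set (a regularization-like effect on the weight representation) while shrinking their memory footprint; the function `quantize_weights` and the `num_bits` parameter are hypothetical names, not the paper's method.

```python
# Illustrative sketch only (not the paper's quantization strategy):
# uniform k-bit quantization of a weight tensor.
import numpy as np

def quantize_weights(w: np.ndarray, num_bits: int = 4) -> np.ndarray:
    """Uniformly quantize weights to 2**num_bits levels over a symmetric range."""
    levels = 2 ** num_bits - 1
    w_max = np.max(np.abs(w))
    if w_max == 0:
        return w.copy()
    scale = (2 * w_max) / levels              # step size between adjacent levels
    # Round each weight to the nearest level, then map back to the original scale.
    q = np.round((w + w_max) / scale)
    return q * scale - w_max

# Example: the 4-bit copy uses far fewer distinct values than the float32 weights,
# which is where both the memory reduction and the regularization effect come from.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(256, 256)).astype(np.float32)
w_q = quantize_weights(w, num_bits=4)
print("distinct values before:", np.unique(w).size, "after:", np.unique(w_q).size)
```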