We present a method for gating deep-learning architectures at a fine-grained level. Individual convolutional maps are turned on or off conditionally on features in the network. This allows us to train neural networks with a large capacity but a lower inference time than the full network. To achieve this, we introduce a new residual block architecture that gates convolutional channels in a fine-grained manner. We also introduce a generally applicable tool, "batch-shaping", that matches the marginal aggregate posteriors of features in a neural network to a pre-specified prior distribution. We use this novel technique to force gates to be more conditional on the data. We present results on the CIFAR-10 and ImageNet datasets for image classification and on Cityscapes for semantic segmentation. Our results show that our method can slim down large architectures conditionally, such that the average computational cost on the data is on par with that of a smaller architecture, but with higher accuracy. In particular, our gated ResNet34 network achieves 72.55% top-1 accuracy, compared to 69.76% for the baseline ResNet18 model, at similar complexity. We also show that the resulting networks automatically learn to use more features for difficult examples and fewer features for simple examples.
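
As a rough illustration of the idea of turning individual channels on or off conditionally on the input, the following is a minimal sketch in plain Python, not the architecture evaluated here. It assumes a hypothetical linear gating head on globally average-pooled features, a hard threshold at zero, and a 1x1 convolution standing in for the block's convolutional layer. All function and parameter names (`gated_residual_block`, `gate_weights`, `gate_bias`, `conv_weights`) are invented for the example, and how the gates are trained is not covered.

```python
def global_avg_pool(x):
    """x is a list of channels, each an H x W nested list. Returns one mean per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]


def conv1x1_channel(x, weights):
    """One output channel of a 1x1 convolution: per-pixel weighted sum of input channels."""
    h, w = len(x[0]), len(x[0][0])
    return [
        [sum(wt * x[c][i][j] for c, wt in enumerate(weights)) for j in range(w)]
        for i in range(h)
    ]


def gated_residual_block(x, conv_weights, gate_weights, gate_bias):
    """Residual block whose output channels are computed only when their gate is on.

    Assumption (illustrative only): gate c is on when a linear function of the
    pooled input features is positive. A gated-off channel skips its convolution
    entirely and passes the input channel through the identity connection.
    """
    pooled = global_avg_pool(x)
    gates = [
        1 if sum(gw * p for gw, p in zip(row, pooled)) + b > 0 else 0
        for row, b in zip(gate_weights, gate_bias)
    ]
    h, w = len(x[0]), len(x[0][0])
    out = []
    for c, (gate, weights) in enumerate(zip(gates, conv_weights)):
        if gate:
            feat = conv1x1_channel(x, weights)  # compute spent only on active channels
            out.append([[x[c][i][j] + feat[i][j] for j in range(w)] for i in range(h)])
        else:
            out.append([row[:] for row in x[c]])  # identity path only
    return out, gates


# Two input channels of size 2 x 2; the gating head switches channel 1 off for this input.
x = [[[1, 1], [1, 1]], [[2, 2], [2, 2]]]
out, gates = gated_residual_block(
    x,
    conv_weights=[[1, 1], [1, 1]],
    gate_weights=[[1, 0], [0, -1]],
    gate_bias=[0, 0],
)
```

Because the gate decisions depend on the pooled input, the number of active channels, and therefore the compute spent, can differ from one input to the next; `sum(gates)` gives the count for a particular example.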