Global Average Pooling (GAP) is the default way to extract channel descriptors in channel-wise attention mechanisms. However, GAP's simple global aggregation tends to make the channel descriptors homogeneous, which weakens the distinction of details between feature maps and thus degrades the performance of the attention mechanism. In this work, we propose a novel method for channel-wise attention networks, called Stochastic Region Pooling (SRP), which makes the channel descriptors more representative and diverse by encouraging the feature maps to have more or wider important feature responses. SRP is also a general method for attention mechanisms that requires no additional parameters or computation, and it can be widely applied to attention networks without modifying the network structure. Experimental results on image recognition datasets, including CIFAR-10/100, ImageNet, and three fine-grained datasets (CUB-200-2011, Stanford Cars, and Stanford Dogs), show that SRP brings significant performance improvements over efficient CNNs and achieves state-of-the-art results.
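To make the contrast between the two pooling schemes concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The abstract does not say how regions are sampled, so the single rectangular region shared by all channels, the `ratio` parameter, and the function names `global_average_pool` and `stochastic_region_pool` are all illustrative assumptions. A practical version would operate on tensors inside a deep learning framework.

```python
import random


def global_average_pool(fmap):
    """Average every channel over its full spatial extent.

    fmap is a list of C channels, each an H x W nested list of floats.
    Returns a list of C channel descriptors.
    """
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]


def stochastic_region_pool(fmap, ratio=0.5, rng=random):
    """Average every channel over one randomly placed rectangular region.

    Assumptions (not taken from the abstract): the region covers `ratio`
    of each spatial side, and the same region is used for all channels.
    """
    h, w = len(fmap[0]), len(fmap[0][0])
    rh, rw = max(1, int(h * ratio)), max(1, int(w * ratio))
    top = rng.randint(0, h - rh)    # randint is inclusive on both ends
    left = rng.randint(0, w - rw)
    return [
        sum(sum(row[left:left + rw]) for row in ch[top:top + rh]) / (rh * rw)
        for ch in fmap
    ]
```

With `ratio=1.0` the region covers the whole map and the sketch reduces to GAP. With a smaller ratio, the descriptor depends on where the region lands, so a channel gets a consistently strong descriptor only if its important responses are spread widely across the map. Neither function introduces learnable parameters.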