We investigate the adversarial robustness of CNNs from the perspective of channel-wise activations. By comparing \textit{non-robust} (normally trained) and \textit{robustified} (adversarially trained) models, we observe that adversarial training (AT) robustifies CNNs by aligning the channel-wise activations of adversarial data with those of their natural counterparts. However, the channels that are \textit{negatively-relevant} (NR) to predictions are still over-activated when processing adversarial data. Moreover, we observe that AT does not yield similar robustness across all classes. For the robust classes, channels with larger activation magnitudes are usually more \textit{positively-relevant} (PR) to predictions, but this alignment does not hold for the non-robust classes. Given these observations, we hypothesize that suppressing NR channels and aligning PR channels with their relevances further enhances the robustness of CNNs under AT. To examine this hypothesis, we introduce a novel mechanism, i.e., \underline{C}hannel-wise \underline{I}mportance-based \underline{F}eature \underline{S}election (CIFS). CIFS manipulates the channel-wise activations of certain layers by generating non-negative multipliers for these channels based on their relevances to predictions. Extensive experiments on benchmark datasets, including CIFAR10 and SVHN, clearly verify the hypothesis and the effectiveness of CIFS in robustifying CNNs. \url{https://github.com/HanshuYAN/CIFS}
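The mechanism described above can be illustrated with a minimal PyTorch-style sketch. This is a hypothetical simplification, not the authors' implementation: it assumes channel relevances are obtained from a linear probe on globally pooled features, and uses a softplus to produce the non-negative multipliers that dampen NR channels and amplify PR ones.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CIFSLayer(nn.Module):
    """Hypothetical sketch of channel-wise importance-based feature selection.

    A linear probe maps globally pooled channel activations to class logits.
    Each channel's relevance is taken as its probe weight for the predicted
    class; a softplus turns relevances into non-negative multipliers that
    rescale the feature map channel-wise.
    """

    def __init__(self, num_channels: int, num_classes: int):
        super().__init__()
        # Auxiliary probe classifier (an assumption of this sketch).
        self.probe = nn.Linear(num_channels, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) intermediate feature map
        pooled = x.mean(dim=(2, 3))            # (B, C) channel activations
        logits = self.probe(pooled)            # (B, K) probe predictions
        top = logits.argmax(dim=1)             # predicted class per sample
        relevance = self.probe.weight[top]     # (B, C) channel relevances
        mult = F.softplus(relevance)           # non-negative multipliers
        # Suppress negatively-relevant channels, boost positively-relevant ones.
        return x * mult.unsqueeze(-1).unsqueeze(-1)


# Usage: insert after a convolutional block of a CNN.
layer = CIFSLayer(num_channels=64, num_classes=10)
feats = torch.randn(4, 64, 8, 8)
selected = layer(feats)  # same shape, channel-wise rescaled
```

Because the multipliers are strictly positive here, NR channels are attenuated rather than zeroed; a hard gate (e.g., ReLU on the relevances) would suppress them entirely.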