Defending against Backdoor Attack on Deep Neural Networks

Although deep neural networks (DNNs) have achieved great success in various computer vision tasks, they have recently been found to be vulnerable to adversarial attacks. In this paper, we focus on the so-called backdoor attack, which injects a backdoor trigger into a small portion of the training data (also known as data poisoning) so that the trained DNN misclassifies examples carrying this trigger. Specifically, we study the effect of both real and synthetic backdoor attacks on the internal response of vanilla and backdoored DNNs through the lens of Grad-CAM. Moreover, we show that the backdoor attack induces a significant bias in neuron activation, and that this bias is more pronounced under the $\ell_\infty$ norm of an activation map than under its $\ell_1$ or $\ell_2$ norm. Spurred by these results, we propose $\ell_\infty$-based neuron pruning to remove the backdoor from the backdoored DNN. Experiments show that our method effectively decreases the attack success rate while maintaining high classification accuracy on clean images.
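The abstract does not spell out the pruning rule, so the following is only a minimal sketch of what $\ell_\infty$-based pruning of activation maps could look like, not the authors' implementation. It assumes that activations are available as an array of shape (N, C, H, W), that each channel is scored by the batch-averaged $\ell_\infty$ norm of its activation map, and that the highest-scoring channels are the ones pruned. The function names `channel_norms` and `linf_prune_mask`, the toy data, and the choice to prune the largest-norm channels are all illustrative assumptions.

```python
import numpy as np


def channel_norms(acts, order):
    """Per-channel norm of each activation map, averaged over the batch.

    acts:  array of shape (N, C, H, W)
    order: 1, 2, or np.inf
    """
    flat = acts.reshape(acts.shape[0], acts.shape[1], -1)
    # Vector norm over the flattened spatial axis, then mean over the batch.
    return np.linalg.norm(flat, ord=order, axis=2).mean(axis=0)


def linf_prune_mask(acts, num_prune):
    """Boolean keep-mask over channels (False = pruned).

    ASSUMPTION: the channels with the largest l_inf score are pruned.
    """
    scores = channel_norms(acts, np.inf)
    mask = np.ones(scores.shape[0], dtype=bool)
    mask[np.argsort(-scores)[:num_prune]] = False
    return mask


# Toy activations: 8 samples, 4 channels, 5x5 maps with values in [0, 1).
rng = np.random.default_rng(0)
acts = rng.random((8, 4, 5, 5))

# Hypothetical "backdoor" channel: one spatially localized, very large response.
acts[:, 2, 0, 0] = 10.0

mask = linf_prune_mask(acts, num_prune=1)

# Pruning here means zeroing the selected channel's activation map.
pruned = acts * mask[None, :, None, None]
```

In a real network the same mask would be applied to the output channels of a chosen convolutional layer; which layer to prune and how many channels to remove are choices the abstract leaves open.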

