An Adaptive Black-box Defense against Trojan Attacks (TrojDef) - 专知 Papers



September 5, 2022

An Adaptive Black-box Defense against Trojan Attacks (TrojDef)


Guanxiong Liu, Abdallah Khreishah, Fatima Sharadgah, Issa Khalil

A Trojan backdoor is a poisoning attack against Neural Network (NN) classifiers in which adversaries exploit the (highly desirable) model reuse property to implant Trojans into model parameters through a poisoned training process, opening a backdoor for later breaches. Most of the proposed defenses against Trojan attacks assume a white-box setup, in which the defender either has access to the inner state of the NN or is able to run back-propagation through it. In this work, we propose a more practical black-box defense, dubbed TrojDef, which can only run the forward pass of the NN. TrojDef tries to identify and filter out Trojan inputs (i.e., inputs augmented with the Trojan trigger) by monitoring the changes in the prediction confidence when the input is repeatedly perturbed by random noise. We derive a function of the prediction outputs, called the prediction confidence bound, to decide whether an input example is Trojan or not. The intuition is that Trojan inputs are more stable, as the misclassification depends only on the trigger, while benign inputs suffer when augmented with noise because the noise perturbs the classification features. Through mathematical analysis, we show that if the attacker is perfect in injecting the backdoor, the Trojan-infected model will be trained to learn the appropriate prediction confidence bound, which is used to distinguish Trojan and benign inputs under arbitrary perturbations. However, because the attacker might not be perfect in injecting the backdoor, we introduce a nonlinear transform to the prediction confidence bound to improve the detection accuracy in practical settings. Extensive empirical evaluations show that TrojDef significantly outperforms state-of-the-art defenses and is highly stable under different settings, even when the classifier architecture, the training process, or the hyper-parameters change.
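The perturb-and-monitor idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the prediction confidence bound and the nonlinear transform are not given in the abstract, so a plain threshold on the mean confidence stands in for them. The names `confidence_stability`, `flag_trojan`, and `toy_model`, the bounded uniform noise, and all numeric values (`eps`, `threshold`, `n_trials`) are assumptions made for illustration only.

```python
import numpy as np


def confidence_stability(predict, x, n_trials=50, eps=0.3, rng=None):
    """Mean confidence the originally predicted class keeps under random noise.

    `predict` is used as a black box: only forward passes are made.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    label = int(np.argmax(predict(x)))
    scores = []
    for _ in range(n_trials):
        # Assumption: bounded uniform noise, inputs scaled to [0, 1].
        noise = rng.uniform(-eps, eps, size=x.shape)
        noisy = np.clip(x + noise, 0.0, 1.0)
        scores.append(predict(noisy)[label])
    return label, float(np.mean(scores))


def flag_trojan(predict, x, threshold=0.9, **kwargs):
    """Flag inputs whose prediction stays confident under perturbation.

    The fixed threshold is a placeholder for the paper's confidence bound.
    """
    _, score = confidence_stability(predict, x, **kwargs)
    return score >= threshold


def toy_model(x):
    """Hypothetical infected classifier: the last entry acts as the trigger."""
    if x[-1] > 0.5:
        # Trigger present: the backdoor forces class 0 with high confidence.
        return np.array([0.99, 0.01])
    # Benign path: a steep logistic on the mean of the remaining features.
    logit = 40.0 * (x[:-1].mean() - 0.5)
    p1 = 1.0 / (1.0 + np.exp(-logit))
    return np.array([1.0 - p1, p1])


x_benign = np.full(16, 0.51)
x_benign[-1] = 0.0           # no trigger
x_trojan = x_benign.copy()
x_trojan[-1] = 1.0           # trigger stamped onto the same input

print(flag_trojan(toy_model, x_benign), flag_trojan(toy_model, x_trojan))
```

In this toy setup the triggered input keeps its 0.99 confidence on every trial because the noise can never push the trigger entry below 0.5, while the benign input's confidence swings with the noise on its features. A real deployment would need the noise distribution and decision rule calibrated against the actual model.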


