Provable Guarantees against Data Poisoning Using Self-Expansion and Compatibility - Zhuanzhi Papers


Identifiable · Datasets · Model evaluation · CIFAR-10 · MoDELS

October 4, 2022

Provable Guarantees against Data Poisoning Using Self-Expansion and Compatibility


Charles Jin, Melinda Sun, Martin Rinard

As deep learning datasets grow larger and less curated, backdoor data poisoning attacks, which inject malicious poisoned data into the training dataset, have drawn increasing attention in both academia and industry. We identify an incompatibility property of the interaction of clean and poisoned data with the training algorithm: including poisoned data in the training dataset does not improve model accuracy on clean data, and vice versa. Leveraging this property, we develop an algorithm that iteratively refines subsets of the poisoned dataset to obtain subsets that concentrate around either clean or poisoned data. The result is a partition of the original dataset into disjoint subsets, and we train a corresponding model for each subset. A voting algorithm over these models identifies the clean data within the larger poisoned dataset. We empirically evaluate our approach on image classification tasks over the GTSRB and CIFAR-10 datasets. The experimental results show that prior dirty-label and clean-label backdoor attacks in the literature produce poisoned datasets that exhibit behavior consistent with the incompatibility property. The results also show that our defense reduces the attack success rate to below 1% in 134 of 165 scenarios in this setting, with only a 2% drop in clean accuracy on CIFAR-10 (and negligible impact on GTSRB).
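The partition-then-vote step described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the abstract does not specify the voting rule or the refinement procedure, so everything below is an assumption made for illustration. In particular, the nearest-centroid classifier stands in for model training, the partition is written by hand rather than produced by iterative refinement, and the names `train_centroid_model`, `vote_clean`, and `threshold` are hypothetical. The assumed rule marks an example as clean when a majority of the per-subset models predict its given label.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Features = Tuple[float, ...]
Example = Tuple[Features, int]      # (features, label)
Model = Callable[[Features], int]   # maps features to a predicted label


def train_centroid_model(subset: Sequence[Example]) -> Model:
    """Toy stand-in for training: a nearest-centroid classifier (assumption)."""
    sums: Dict[int, Features] = {}
    counts: Counter = Counter()
    for x, y in subset:
        counts[y] += 1
        prev = sums.get(y, (0.0,) * len(x))
        sums[y] = tuple(p + v for p, v in zip(prev, x))
    centroids = {y: tuple(s / counts[y] for s in sums[y]) for y in sums}

    def predict(x: Features) -> int:
        # Return the label whose centroid is closest in squared distance.
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    return predict


def vote_clean(
    dataset: Sequence[Example],
    partition: Sequence[Sequence[int]],
    threshold: float = 0.5,
) -> List[int]:
    """Train one model per disjoint subset, then keep the indices whose label
    is predicted by more than `threshold` of the models (assumed rule)."""
    models = [train_centroid_model([dataset[i] for i in idx]) for idx in partition]
    clean = []
    for i, (x, y) in enumerate(dataset):
        agree = sum(1 for m in models if m(x) == y)
        if agree / len(models) > threshold:
            clean.append(i)
    return clean


# Toy data: indices 0-7 are clean; 8 and 9 look like class 0 but carry label 1.
dataset: List[Example] = [
    ((0.0, 0.1), 0), ((0.2, 0.0), 0), ((0.1, 0.3), 0), ((0.3, 0.2), 0),
    ((5.0, 5.1), 1), ((5.2, 4.9), 1), ((4.8, 5.0), 1), ((5.1, 5.3), 1),
    ((0.1, 0.1), 1), ((0.2, 0.2), 1),
]
# Hand-written partition: two mostly clean subsets and one poisoned subset.
partition = [[0, 1, 4, 5], [2, 3, 6, 7], [8, 9]]

clean = vote_clean(dataset, partition)
print(clean)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

In this toy setup the two models trained on clean subsets outvote the model trained on the poisoned subset, so the mislabeled examples at indices 8 and 9 are excluded. A real defense would need the refinement step to produce such a partition, which this sketch does not attempt.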

