In recent years, Deep Neural Networks (DNNs) have had a dramatic impact on a variety of problems that were long considered very difficult, such as image classification and automatic language translation. The accuracy of modern DNNs in classification tasks is remarkable. At the same time, attackers have devised powerful methods to construct specially crafted malicious inputs (often referred to as adversarial examples) that trick DNNs into misclassifying them. Worse, despite the many defense mechanisms proposed to protect DNNs against adversarial attacks, attackers are often able to circumvent these defenses, rendering them useless. This state of affairs is extremely worrying, especially as machine learning systems are adopted at scale. In this paper, we propose a scientific evaluation methodology for assessing the quality, efficacy, robustness, and efficiency of randomized defenses that protect DNNs against adversarial examples. Using this methodology, we evaluate a variety of defense mechanisms. In addition, we propose a defense mechanism we call Randomly Perturbed Ensemble Neural Networks (RPENNs). We provide a thorough evaluation of the considered defense mechanisms under a white-box attacker model, using six different adversarial attack methods and the ILSVRC2012 validation data set.