Adversarial examples of deep neural networks are receiving ever-increasing attention because they help in understanding and reducing the networks' sensitivity to their input. This is natural given the growing number of applications of deep neural networks in everyday life. Because white-box attacks are almost always successful, it is typically only the distortion of the perturbations that matters in their evaluation. In this work, we argue that speed is important as well, especially considering that adversarial training requires fast attacks. Given more time, iterative methods can always find better solutions. We investigate this speed-distortion trade-off in some depth and introduce a new attack called boundary projection (BP) that improves upon existing methods by a large margin. Our key idea is that the classification boundary is a manifold in the image space: we therefore quickly reach the boundary and then optimize distortion on this manifold.
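The two-stage idea in the last sentence (reach the boundary quickly, then reduce distortion while staying on it) can be illustrated on a toy problem. The sketch below is not the BP algorithm itself, whose details the abstract does not give. It assumes a hypothetical two-dimensional classifier with decision function f(x) = x1^2 + 4*x2^2 - 1, fixed step sizes, and a simple tangent-step-plus-projection refinement. All function names and parameters are illustrative.

```python
import math

# Hypothetical toy classifier: f(x) < 0 is the original class, f(x) > 0 the
# adversarial class. The decision boundary f(x) = 0 is an ellipse.
A = (1.0, 4.0)

def f(x):
    return A[0] * x[0] ** 2 + A[1] * x[1] ** 2 - 1.0

def grad(x):
    return (2.0 * A[0] * x[0], 2.0 * A[1] * x[1])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def axpy(u, v, s):
    """Return u + s * v for 2-D tuples."""
    return (u[0] + s * v[0], u[1] + s * v[1])

def norm(u):
    return math.sqrt(dot(u, u))

def reach_boundary(x0, step=0.05, max_iter=1000):
    """Stage 1: follow the normalized gradient until the label flips."""
    x = x0
    for _ in range(max_iter):
        if f(x) > 0.0:
            return x
        g = grad(x)
        x = axpy(x, g, step / norm(g))
    raise RuntimeError("boundary not reached")

def project(x, margin=1e-6, iters=20):
    """Newton-style steps along the gradient onto the level set f(x) = margin,
    i.e. just on the adversarial side of the boundary."""
    for _ in range(iters):
        g = grad(x)
        x = axpy(x, g, -(f(x) - margin) / dot(g, g))
    return x

def refine_on_boundary(x0, x, lr=0.1, iters=200):
    """Stage 2: shrink the distortion ||x - x0|| while staying on the boundary.
    Each iteration steps along the tangent direction that reduces the
    distortion, then projects back onto the boundary."""
    x = project(x)
    for _ in range(iters):
        g = grad(x)
        n = (g[0] / norm(g), g[1] / norm(g))      # unit normal to the boundary
        delta = axpy(x, x0, -1.0)                 # current perturbation
        tangential = axpy(delta, n, -dot(delta, n))
        x = project(axpy(x, tangential, -lr))
    return x

x0 = (0.5, 0.2)                         # clean input, classified as f < 0
x_first = reach_boundary(x0)            # fast but over-distorted solution
x_adv = refine_on_boundary(x0, x_first)

print("stage 1 distortion:", norm(axpy(x_first, x0, -1.0)))
print("stage 2 distortion:", norm(axpy(x_adv, x0, -1.0)))
print("still adversarial:", f(x_adv) > 0.0)
```

In this toy setting the first stage stops as soon as the label flips, so it overshoots the boundary; the second stage trades extra iterations for lower distortion, which is the speed-distortion trade-off the abstract refers to. A real image classifier would need automatic differentiation in place of the closed-form gradient used here.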