Defense models against adversarial attacks have grown significantly, but the lack of practical evaluation methods has hindered progress. Evaluation can be defined as searching for the lower bound of a defense model's robustness given an iteration budget and a test dataset. A practical evaluation method should be convenient (i.e., parameter-free), efficient (i.e., requiring fewer iterations), and reliable (i.e., approaching the lower bound of robustness). Toward this goal, we propose a parameter-free Adaptive Auto Attack (A$^3$) evaluation method that addresses efficiency and reliability in a test-time-training fashion. Specifically, observing that adversarial examples for a specific defense model follow some regularities in their starting points, we design an Adaptive Direction Initialization strategy to speed up the evaluation. Furthermore, to approach the lower bound of robustness within the iteration budget, we propose an online statistics-based discarding strategy that automatically identifies and abandons hard-to-attack images. Extensive experiments demonstrate the effectiveness of A$^3$. In particular, we apply A$^3$ to nearly 50 widely used defense models. While consuming far fewer iterations than existing methods, i.e., $1/10$ on average (a 10$\times$ speed-up), we achieve lower robust accuracy in all cases. Notably, we won $\textbf{first place}$ out of 1681 teams in the CVPR 2021 White-box Adversarial Attacks on Defense Models competition with this method. Code is available at: https://github.com/liuye6666/adaptive_auto_attack
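To make the evaluation setting concrete, the following is a minimal sketch of budgeted robustness evaluation with a discarding step. It is not the A$^3$ implementation: the toy linear classifier, the helper names (`margins`, `pgd_linear`, `evaluate`), the margin-based ranking, the `keep` fraction, and the doubling step schedule are all assumptions chosen for illustration, and the Adaptive Direction Initialization strategy is not modeled (a uniform random start is used instead). Note that `keep` is itself a parameter, so this sketch is not parameter-free.

```python
import numpy as np

def margins(W, b, X, y):
    """True-class logit minus the best other logit (<= 0 means misclassified)."""
    logits = X @ W + b
    idx = np.arange(len(y))
    true = logits[idx, y]
    masked = logits.copy()
    masked[idx, y] = -np.inf
    return true - masked.max(axis=1)

def pgd_linear(W, b, X0, y, eps, steps, rng):
    """Sign-gradient L_inf attack on a linear model from a random start."""
    X = X0 + rng.uniform(-eps, eps, size=X0.shape)
    step = 2.0 * eps / max(steps, 1)
    idx = np.arange(len(y))
    for _ in range(steps):
        logits = X @ W + b
        masked = logits.copy()
        masked[idx, y] = -np.inf
        other = masked.argmax(axis=1)
        # Gradient of the margin w.r.t. the input for a linear model.
        grad = W[:, y].T - W[:, other].T
        # Descend the margin, then project back into the eps-ball around X0.
        X = np.clip(X - step * np.sign(grad), X0 - eps, X0 + eps)
    return X

def evaluate(W, b, X, y, eps, rounds=3, base_steps=5, keep=0.5, seed=0):
    """Return (robust accuracy estimate, total attack iterations used)."""
    rng = np.random.default_rng(seed)
    best = margins(W, b, X, y)   # smallest margin seen so far per image
    robust = best > 0            # clean errors count as non-robust
    active = robust.copy()       # images still being attacked
    used = 0
    for r in range(rounds):
        ids = np.flatnonzero(active)
        if ids.size == 0:
            break
        steps = base_steps * 2 ** r   # assumed schedule: more steps per round
        X_adv = pgd_linear(W, b, X[ids], y[ids], eps, steps, rng)
        used += steps * ids.size
        m = margins(W, b, X_adv, y[ids])
        best[ids] = np.minimum(best[ids], m)
        broken = ids[m <= 0]
        robust[broken] = False
        active[broken] = False
        # Discarding: keep only the fraction of survivors with the smallest
        # margin so far; the rest are treated as hard to attack and dropped.
        survivors = np.flatnonzero(active)
        n_keep = int(np.ceil(keep * survivors.size))
        order = survivors[np.argsort(best[survivors])]
        active[order[n_keep:]] = False
    return robust.mean(), used

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W, b = rng.normal(size=(4, 3)), np.zeros(3)
    X = rng.normal(size=(50, 4))
    y = (X @ W + b).argmax(axis=1)
    print(evaluate(W, b, X, y, eps=0.5))
```

Discarded images are still counted as robust, so the returned value is an upper estimate of the true robust accuracy; the design question such a strategy faces is how to spend the saved iterations on the images most likely to be broken.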