Learning the Unlearnable: Adversarial Augmentations Suppress Unlearnable Example Attacks

Unlearnable example attacks are data poisoning techniques that can be used to safeguard public data against unauthorized use for training deep learning models. These methods add stealthy perturbations to the original images, making it difficult for deep learning models to learn effectively from the resulting training data. Current research suggests that adversarial training can, to a certain degree, mitigate the impact of unlearnable example attacks, while common data augmentation methods are not effective against such poisons. Adversarial training, however, demands considerable computational resources and can result in non-trivial accuracy loss. In this paper, we introduce the UEraser method, which outperforms current defenses against different types of state-of-the-art unlearnable example attacks through a combination of effective data augmentation policies and loss-maximizing adversarial augmentations. In stark contrast to the current SOTA adversarial training methods, UEraser uses adversarial augmentations, which extend beyond the confines of the $\ell_p$ perturbation budget assumed by current unlearnable example attacks and defenses. They also help to improve the model's generalization ability, thus protecting against accuracy loss. UEraser wipes out the unlearnability effect with error-maximizing data augmentations, thus restoring trained model accuracies. Interestingly, UEraser-Lite, a fast variant without adversarial augmentations, is also highly effective in preserving clean accuracies. On challenging unlearnable CIFAR-10, CIFAR-100, SVHN, and ImageNet-subset datasets produced with various attacks, it achieves results that are comparable to those obtained during clean training. We also demonstrate its efficacy against possible adaptive attacks. Our code is open source and available to the deep learning community: https://github.com/lafeat/ueraser.

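The abstract mentions "error-maximizing data augmentations" without giving details. One plausible reading is sketched below: for each training sample, draw several random augmentations and keep the one on which the current model incurs the highest loss. This is only an illustration under stated assumptions, not the authors' implementation. The function names, the number of candidates `k`, and the toy augmentation and loss are all hypothetical stand-ins for an image augmentation policy and a network's training loss.

```python
import random
from typing import Callable, List, Optional, Tuple

Sample = List[float]


def error_maximizing_augment(
    sample: Sample,
    label: float,
    augment: Callable[[Sample, random.Random], Sample],
    loss_fn: Callable[[Sample, float], float],
    k: int = 5,  # assumed number of candidates; not taken from the abstract
    rng: Optional[random.Random] = None,
) -> Tuple[Sample, float]:
    """Draw k random augmentations and return the one with the highest loss."""
    rng = rng or random.Random()
    candidates = [augment(sample, rng) for _ in range(k)]
    losses = [loss_fn(c, label) for c in candidates]
    worst = max(range(k), key=losses.__getitem__)
    return candidates[worst], losses[worst]


def toy_augment(x: Sample, rng: random.Random) -> Sample:
    """Hypothetical augmentation: random flip followed by a random cyclic shift."""
    out = list(reversed(x)) if rng.random() < 0.5 else list(x)
    shift = rng.randrange(len(out))
    return out[shift:] + out[:shift]


def toy_loss(x: Sample, label: float) -> float:
    """Hypothetical loss: squared error of a fixed linear model."""
    weights = [0.5, -1.0, 2.0, 0.25]
    score = sum(w * v for w, v in zip(weights, x))
    return (score - label) ** 2


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    hardest, loss = error_maximizing_augment(
        x, 1.0, toy_augment, toy_loss, k=8, rng=random.Random(0)
    )
    print(hardest, loss)
```

In a real training loop the selected sample would be fed to the optimizer step in place of the original, so the model always trains on the hardest of the sampled views. Note that the selection costs `k` extra forward passes per sample, which is presumably why a variant without this step would be faster.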