Adding perturbations to images can mislead classification models into producing incorrect results. Recently, researchers have exploited adversarial perturbations to protect image privacy by preventing retrieval by intelligent models. However, adding adversarial perturbations to images destroys the original data, rendering the images useless in digital forensics and other fields. To prevent illegal or unauthorized access to sensitive image data such as human faces without impeding legitimate users, reversible adversarial attack techniques are increasingly being adopted: the original image can be recovered exactly from its reversible adversarial example. Existing reversible adversarial attack methods, however, are designed for traditional imperceptible adversarial perturbations and ignore local visible adversarial perturbations. In this paper, we propose a new method for generating reversible adversarial examples based on local visible adversarial perturbations. The information needed for image recovery is embedded into the area outside the adversarial patch using reversible data hiding. To reduce image distortion, lossless compression and the B-R-G (blue-red-green) channel embedding principle are adopted. Experiments on the CIFAR-10 and ImageNet datasets show that the proposed method restores the original images error-free while maintaining strong attack performance.
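To make the pipeline concrete, below is a minimal sketch of how recovery information can be hidden reversibly outside a visible patch. It is an illustration under simplifying assumptions, not the paper's exact algorithm: it uses a basic histogram-shifting reversible-data-hiding scheme on a single channel, zlib for lossless compression, and toy data. The names hs_embed/hs_extract are hypothetical, and the side information needed for extraction (peak/zero bins, payload length, patch location) is passed out-of-band here rather than embedded.

```python
import zlib
import numpy as np

def hs_embed(pixels, bits):
    """Histogram-shifting reversible embed into a flat uint8 array.
    Returns (marked, peak, zero); (peak, zero, len(bits)) is side
    information the extractor needs. Assumes an empty histogram bin
    exists above the peak and that capacity >= len(bits)."""
    hist = np.bincount(pixels, minlength=256)
    peak = int(hist.argmax())
    empty = np.flatnonzero(hist[peak + 1:] == 0)
    if empty.size == 0 or hist[peak] < len(bits):
        raise ValueError("insufficient capacity for this sketch")
    zero = peak + 1 + int(empty[0])
    marked = pixels.astype(np.int16)
    # Shift the (peak, zero) range right by 1 to open a gap at peak+1.
    marked[(marked > peak) & (marked < zero)] += 1
    # Embed: a peak-valued pixel stays at peak for bit 0, moves to peak+1 for bit 1.
    carriers = np.flatnonzero(pixels == peak)[: len(bits)]
    marked[carriers] += np.asarray(bits, dtype=np.int16)
    return marked.astype(np.uint8), peak, zero

def hs_extract(marked, peak, zero, nbits):
    """Inverse of hs_embed: recover the bits and the exact original array."""
    m = marked.astype(np.int16)
    carriers = np.flatnonzero((m == peak) | (m == peak + 1))[:nbits]
    bits = (m[carriers] == peak + 1).astype(np.uint8)
    restored = m.copy()
    restored[carriers] = peak                                  # undo embedding
    restored[(restored > peak + 1) & (restored <= zero)] -= 1  # undo shifting
    return bits, restored.astype(np.uint8)

# Toy pipeline on a synthetic 128x128 RGB image: paste a visible patch,
# then hide the losslessly compressed original patch pixels in the blue
# channel outside the patch (blue first, per the B-R-G principle).
rng = np.random.default_rng(0)
ramp = np.tile(np.arange(128, dtype=np.uint8), (128, 1))
img = np.stack([ramp, ramp.T, np.full((128, 128), 90, np.uint8)], axis=-1)
y0, y1, x0, x1 = 40, 60, 40, 60                      # patch location
payload = zlib.compress(img[y0:y1, x0:x1].tobytes()) # recovery information
bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))

adv = img.copy()
adv[y0:y1, x0:x1] = rng.integers(0, 256, (20, 20, 3), dtype=np.uint8)
mask = np.ones((128, 128), dtype=bool)
mask[y0:y1, x0:x1] = False                           # embed outside the patch only
marked, peak, zero = hs_embed(adv[..., 2][mask], bits)
rae = adv.copy()
rae[..., 2][mask] = marked                           # reversible adversarial example

# Recovery: extract the payload, decompress, and paste the original
# patch region back; every cover pixel is restored bit-exactly.
bits_out, blue_restored = hs_extract(rae[..., 2][mask], peak, zero, bits.size)
rec = rae.copy()
rec[..., 2][mask] = blue_restored
rec[y0:y1, x0:x1] = np.frombuffer(
    zlib.decompress(np.packbits(bits_out).tobytes()), dtype=np.uint8
).reshape(20, 20, 3)
assert np.array_equal(rec, img)                      # error-free restoration
```

The sketch embeds only in the blue channel for brevity; under the B-R-G principle the blue channel is filled first, presumably because the human visual system is least sensitive to distortion there, with red and then green used only if capacity runs out.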