Hashing images with a perceptual algorithm is a common approach to solving duplicate image detection problems. However, perceptual image hashing algorithms are differentiable, and are thus vulnerable to gradient-based adversarial attacks. We demonstrate that not only is it possible to modify an image to produce an unrelated hash, but an exact image hash collision between a source and target image can be produced via minuscule adversarial perturbations. In a white box setting, these collisions can be replicated across nearly every image pair and hash type (including both deep and non-learned hashes). Furthermore, by attacking points other than the output of a hashing function, an attacker can avoid having to know the details of a particular algorithm, resulting in collisions that transfer across different hash sizes or model architectures. Using these techniques, an adversary can poison the image lookup table of a duplicate image detection service, resulting in undefined or unwanted behavior. Finally, we offer several potential mitigations to gradient-based image hash attacks.
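To make the attack concrete, the following is a minimal sketch of a gradient-based collision attack of the kind the abstract describes, assuming PyTorch and a differentiable (or differentiably approximated) perceptual hash. The names `hash_model`, `steps`, `epsilon`, and `lr` are illustrative placeholders, not the paper's actual implementation.

```python
import torch

def collide(source, target, hash_model, steps=1000, epsilon=8 / 255, lr=1e-2):
    """Perturb `source` within an L-infinity ball of radius `epsilon` so that
    its hash approaches the hash of `target` (hypothetical sketch)."""
    # Fix the target hash; it never receives gradients.
    target_hash = hash_model(target).detach()
    # Optimize an additive perturbation rather than the image itself.
    delta = torch.zeros_like(source, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        adv = (source + delta).clamp(0.0, 1.0)
        # Minimize the distance between the adversarial hash and the target hash.
        loss = torch.norm(hash_model(adv) - target_hash, p=2)
        loss.backward()
        opt.step()
        # Project back into the L-infinity ball to keep the change minuscule.
        with torch.no_grad():
            delta.clamp_(-epsilon, epsilon)
    return (source + delta).detach().clamp(0.0, 1.0)
```

After optimization, quantizing `hash_model(adv)` and `target_hash` to bits and comparing them would indicate whether an exact collision was reached; the white-box setting assumed here gives the attacker gradient access to `hash_model`.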