The ability of generative diffusion models (DMs) such as Stable Diffusion (SD) to replicate training data can be exploited by attackers to launch a Copyright Infringement Attack using duplicated, poisoned image-text pairs. SilentBadDiffusion (SBD) is a recently proposed method that has shown outstanding performance in attacking SD on text-to-image tasks. However, usable data resources in this area remain limited, and some are restricted or prohibited because of issues such as copyright ownership or inappropriate content. Moreover, not all images in current datasets are suitable for the proposed attack methods. In addition, the state-of-the-art (SoTA) performance of SBD is far from ideal when only a few generated poisoning samples can be used for the attack. In this paper, we introduce new datasets that are accessible for research on attacks such as SBD, and we propose a Multi-Element (ME) attack method based on SBD that increases the number of poisonous visual-text elements per poisoned sample to strengthen the attack, while applying the Discrete Cosine Transform (DCT) to the poisoned samples to maintain stealthiness. The Copyright Infringement Rate (CIR) / First Attack Epoch (FAE) we obtained on the two new datasets were 16.78% / 39.50 and 51.20% / 23.60, respectively, close to or even better than the results on the benchmark Pokemon and Midjourney datasets. Under a low subsampling ratio (5%, 6 poisoned samples), MESI and DCT achieved CIR / FAE of 0.23% / 84.00 and 12.73% / 65.50, respectively, both better than the original SBD, which failed to attack at all.
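To help read the CIR / FAE figures quoted above, the following is a minimal sketch of how such metrics could be computed. It rests on assumptions not stated in the abstract: that CIR is the fraction of generated images whose similarity to the copyrighted target exceeds a threshold, and that FAE is the first training epoch in which at least one generation crosses that threshold. The function names, the similarity scores, and the threshold value are all hypothetical.

```python
from typing import Optional, Sequence


def copyright_infringement_rate(similarities: Sequence[float], threshold: float) -> float:
    """Assumed CIR: share of generations whose similarity exceeds the threshold."""
    if not similarities:
        return 0.0
    hits = sum(1 for s in similarities if s > threshold)
    return hits / len(similarities)


def first_attack_epoch(
    per_epoch_similarities: Sequence[Sequence[float]], threshold: float
) -> Optional[int]:
    """Assumed FAE: 1-based index of the first epoch with any generation above
    the threshold, or None if the attack never succeeds."""
    for epoch, sims in enumerate(per_epoch_similarities, start=1):
        if any(s > threshold for s in sims):
            return epoch
    return None


# Hypothetical similarity scores for three epochs, two generations each.
scores_by_epoch = [[0.1, 0.2], [0.3, 0.4], [0.6, 0.1]]
THRESHOLD = 0.5  # illustrative value only

fae = first_attack_epoch(scores_by_epoch, THRESHOLD)
cir = copyright_infringement_rate(scores_by_epoch[-1], THRESHOLD)
print(f"FAE = {fae}, CIR at final epoch = {cir:.2%}")
```

Under these assumptions, a run in which no generation ever crosses the threshold yields no FAE at all, which is one way to read the statement that the original SBD "failed to attack". Fractional FAE values such as 39.50 would then arise from averaging over several runs.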