Recent advances in generative image modeling have achieved visual realism sufficient to deceive human experts, yet their potential for privacy-preserving data sharing remains insufficiently understood. Central obstacles are the absence of reliable memorization detection mechanisms, limited quantitative evaluation, and poor generalization of existing privacy auditing methods across domains. To address these, we propose to view memorization detection as a unified problem at the intersection of re-identification and copy detection, whose complementary goals cover both identity consistency and augmentation-robust duplication, and we introduce the Latent Contrastive Memorization Network (LCMem), a cross-domain model evaluated jointly on both tasks. LCMem uses a two-stage training strategy that first learns identity consistency and then incorporates augmentation-robust copy detection. Across six benchmark datasets, LCMem achieves improvements of up to 16 percentage points on re-identification and 30 percentage points on copy detection, enabling substantially more reliable memorization detection at scale. Our results also show that existing privacy filters provide limited performance and robustness, highlighting the need for stronger protection mechanisms. Overall, LCMem sets a new standard for cross-domain privacy auditing, offering reliable and scalable memorization detection. Code and model are publicly available at https://github.com/MischaD/LCMem.
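To make the auditing setting concrete, the following is a minimal sketch of how an embedding-based memorization detector could be applied to a set of synthetic images. It is not the LCMem implementation or its API. It assumes that some encoder has already mapped training and synthetic images to fixed-length vectors, and it flags a synthetic sample when its nearest training embedding exceeds a cosine-similarity threshold. The function names, the toy vectors, and the threshold of 0.9 are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def flag_memorized(train_emb, synth_emb, threshold=0.9):
    """Return (synthetic_index, nearest_train_index, similarity) for every
    synthetic embedding whose nearest training embedding reaches the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    flagged = []
    for i, s in enumerate(synth_emb):
        sims = [cosine(s, t) for t in train_emb]
        j = max(range(len(sims)), key=sims.__getitem__)  # nearest training sample
        if sims[j] >= threshold:
            flagged.append((i, j, sims[j]))
    return flagged


# Toy embeddings standing in for encoder outputs (hypothetical values).
train = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
synth = [[0.99, 0.05, 0.0], [0.5, 0.5, 0.7]]

result = flag_memorized(train, synth)
for synth_idx, train_idx, sim in result:
    print(f"synthetic {synth_idx} is close to training {train_idx} (cos={sim:.3f})")
```

In this toy run only the first synthetic vector is flagged, because it is nearly collinear with the first training vector. In practice the threshold would have to be calibrated on held-out data, and the quality of the audit depends entirely on how well the encoder captures identity and augmented copies.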