Image Forgery Localization (IFL) is a crucial task in image forensics that aims to identify manipulated or tampered regions within an image at the pixel level. Existing methods typically generate a single deterministic localization map, which often lacks the precision and reliability required for high-stakes applications such as forensic analysis and security surveillance. To enhance the credibility of predictions and mitigate the risk of errors, we introduce a Conditional Bernoulli Diffusion Model (CBDiff). Given a forged image, CBDiff generates multiple diverse and plausible localization maps, offering a richer representation of the forgery distribution and capturing the uncertainty and variability inherent in tampered regions. CBDiff incorporates Bernoulli noise into the diffusion process to more faithfully reflect the binary and sparse nature of forgery masks. It also introduces a Time-Step Cross-Attention (TSCAttention) module, which is designed to combine semantic feature guidance with diffusion time steps to improve manipulation detection. Extensive experiments on eight public benchmark datasets demonstrate that CBDiff significantly outperforms existing state-of-the-art methods, highlighting its strong potential for real-world deployment.
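To make the Bernoulli-noise idea concrete, the following is a minimal sketch of a generic Bernoulli forward (noising) process on a binary mask, together with a simple way to summarize several sampled localization maps. It is not CBDiff's implementation: the abstract does not specify the noise schedule or transition kernel, so the closed form `Bernoulli(alpha_bar_t * y_0 + (1 - alpha_bar_t) / 2)` is assumed here as the usual choice for binary diffusion, and the names `bernoulli_forward` and `summarize_samples` are hypothetical.

```python
import numpy as np

def bernoulli_forward(mask0, alpha_bar_t, rng):
    """Sample a noisy binary mask y_t given the clean mask y_0.

    Assumed kernel (generic binary diffusion, not confirmed for CBDiff):
        y_t ~ Bernoulli(alpha_bar_t * y_0 + (1 - alpha_bar_t) / 2)
    alpha_bar_t = 1 keeps the mask intact; alpha_bar_t = 0 gives fair coin flips.
    """
    probs = alpha_bar_t * mask0 + (1.0 - alpha_bar_t) / 2.0
    return (rng.random(mask0.shape) < probs).astype(np.uint8)

def summarize_samples(samples):
    """Pixel-wise mean and Bernoulli variance across sampled binary maps."""
    stacked = np.stack(samples).astype(np.float64)
    mean = stacked.mean(axis=0)
    return mean, mean * (1.0 - mean)

rng = np.random.default_rng(0)

# Toy ground-truth mask: a small (sparse) tampered region in an 8x8 image.
mask0 = np.zeros((8, 8), dtype=np.uint8)
mask0[2:5, 3:6] = 1

clean = bernoulli_forward(mask0, 1.0, rng)  # no noise: identical to mask0
noisy = bernoulli_forward(mask0, 0.0, rng)  # pure noise: independent of mask0

# Stand-in for several maps sampled by a trained reverse process.
mean_map, var_map = summarize_samples([mask0, 1 - mask0])
```

Because every intermediate state stays binary, the noise matches the support of a forgery mask, unlike Gaussian noise on a real-valued map. In practice the list passed to `summarize_samples` would hold maps sampled from a trained model; pixels with high variance are those on which the samples disagree.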