Existing deepfake detection methods aim to build generalizable detectors. Although generalization is the ultimate goal, it is idealistic to expect a detector trained on limited forgeries and domains to cover entirely unseen variations, especially given the diversity of real-world deepfakes. Introducing large-scale multi-domain training data is therefore both feasible and important for real-world applications. In such a multi-domain scenario, however, the differences between domains, rather than the subtle real/fake distinctions, dominate the feature space. As a result, although detectors can separate real from fake reasonably well within each domain (i.e., high AUC), they struggle to make single-image real/fake judgments when the domain is unspecified (i.e., low ACC). In this paper, we first define a new research paradigm, Multi-In-Domain Face Forgery Detection (MID-FFD), in which a sufficient volume of real and fake domains is available for training. The detector must then give definitive real/fake judgments on domain-unspecified inputs, which simulates frame-by-frame independent detection in the real world. To address the domain-dominance issue, we propose a model-agnostic framework termed DevDet (Developer for Detector) that amplifies real/fake differences so that they dominate the feature space. DevDet consists of a Face Forgery Developer (FFDev) and a Dose-Adaptive detector Fine-Tuning strategy (DAFT). Experiments demonstrate that our method is superior at real/fake prediction under the MID-FFD scenario while maintaining the original generalization ability to unseen data.
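
The gap between high per-domain AUC and low domain-unspecified ACC can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the scores are invented rather than taken from the paper's experiments, the two domains and the fixed 0.5 threshold are hypothetical, and `auc` and `accuracy` are plain helper functions, not part of DevDet. Each domain ranks its fakes above its reals perfectly, but the domain offset shifts the score ranges, so a single shared threshold misclassifies half of the pooled samples.

```python
# Toy illustration only: all scores are made up, not results from the paper.

def auc(scores, labels):
    """Pairwise AUC: fraction of (fake, real) pairs where the fake scores higher."""
    fakes = [s for s, y in zip(scores, labels) if y == 1]
    reals = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if f > r else 0.5 if f == r else 0.0
               for f in fakes for r in reals)
    return wins / (len(fakes) * len(reals))

def accuracy(scores, labels, threshold=0.5):
    """Accuracy of hard real/fake decisions made with one fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Label 0 = real, 1 = fake; the same label layout is used for both domains.
labels = [0, 0, 0, 1, 1, 1]
# Hypothetical domain A: all scores sit low, fakes slightly above reals.
domain_a = [0.10, 0.15, 0.20, 0.30, 0.35, 0.40]
# Hypothetical domain B: same ordering, but the whole range is shifted up.
domain_b = [0.60, 0.65, 0.70, 0.80, 0.85, 0.90]

auc_a = auc(domain_a, labels)
auc_b = auc(domain_b, labels)

# Domain-unspecified setting: pool both domains and apply one threshold.
pooled_scores = domain_a + domain_b
pooled_labels = labels + labels
pooled_auc = auc(pooled_scores, pooled_labels)
pooled_acc = accuracy(pooled_scores, pooled_labels, threshold=0.5)

print(f"AUC domain A: {auc_a:.2f}")
print(f"AUC domain B: {auc_b:.2f}")
print(f"Pooled AUC:   {pooled_auc:.2f}")
print(f"Pooled ACC:   {pooled_acc:.2f}")
```

In this sketch the between-domain shift is larger than the real/fake gap inside each domain, which is the domain-dominant situation the abstract describes. Making the real/fake difference dominant would correspond to the two classes separating along the same score range in both domains.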