Dataset distillation provides an effective approach to reducing memory and computational costs by optimizing a compact dataset that achieves performance comparable to that of the full original dataset. However, for large-scale datasets and complex deep networks (e.g., ImageNet-1K with ResNet-101), the vast optimization space hinders distillation effectiveness, limiting practical applications. Recent methods leverage pre-trained diffusion models to directly generate informative images, thereby bypassing pixel-level optimization and achieving promising results. Nonetheless, these approaches often suffer from distribution shifts between the pre-trained diffusion prior and the target datasets, and they require re-running distillation under each new setting. To overcome these challenges, we propose a novel framework that is orthogonal to existing diffusion-based distillation techniques: it uses the diffusion prior for patch selection rather than generation. Our method predicts noise with the diffusion model conditioned on input images and optional text prompts (with or without label information), and computes the associated loss for each image-patch pair. Based on the loss differences, we identify distinctive regions within the original images. Furthermore, we apply intra-class clustering and ranking to the selected patches to enforce diversity constraints. This streamlined pipeline enables a one-step distillation process. Extensive experiments demonstrate that our approach consistently outperforms state-of-the-art methods across various metrics and settings.
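The selection pipeline above can be sketched in a few steps: score each patch by a diffusion denoising loss, keep the most distinctive patches, then enforce intra-class diversity via clustering and ranking. The sketch below is a minimal, self-contained illustration: `toy_noise_predictor` stands in for a pre-trained diffusion model, the "largest loss = most distinctive" rule and the tiny k-means diversity step are illustrative assumptions, and the single fixed noise level simplifies the actual noise schedule.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_noise_predictor(noisy):
    # Stand-in for a pre-trained diffusion model's noise prediction
    # (optionally conditioned on text prompts in the real method).
    return np.zeros_like(noisy)

def patch_loss_map(image, noise_predictor, patch=8):
    """Score each non-overlapping patch by the denoising MSE between
    the true injected noise and the model's predicted noise."""
    eps = rng.standard_normal(image.shape)
    noisy = image + eps                      # one fixed noise level (sketch)
    err = (noise_predictor(noisy) - eps) ** 2
    H, W = image.shape
    ph, pw = H // patch, W // patch
    # Average the per-pixel error inside each (patch x patch) cell.
    return err[:ph * patch, :pw * patch].reshape(ph, patch, pw, patch).mean(axis=(1, 3))

def select_distinctive_patches(images, noise_predictor, patch=8, k=4):
    """Keep the k patches per image with the largest loss (assumed here
    to mark distinctive regions; the paper ranks by loss differences)."""
    selected = []
    for img in images:
        lm = patch_loss_map(img, noise_predictor, patch)
        idx = np.argsort(lm.ravel())[::-1][:k]
        selected.append([(int(i // lm.shape[1]), int(i % lm.shape[1])) for i in idx])
    return selected

def diverse_top_patches(feats, losses, n_clusters=2, iters=10):
    """Intra-class diversity constraint: k-means cluster the patch
    features, then keep the highest-loss patch from each cluster."""
    centers = feats[rng.choice(len(feats), n_clusters, replace=False)]
    for _ in range(iters):
        dist = ((feats[:, None] - centers[None]) ** 2).sum(-1)
        assign = dist.argmin(1)
        for c in range(n_clusters):
            members = assign == c
            if members.any():
                centers[c] = feats[members].mean(0)
    keep = []
    for c in range(n_clusters):
        members = np.where(assign == c)[0]
        if len(members):
            keep.append(int(members[np.argmax(losses[members])]))
    return sorted(keep)
```

Because selection only requires forward passes of the frozen diffusion model (no generation or pixel-level optimization), the whole pipeline runs as a single pass over the dataset, matching the one-step distillation claim.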