Diffusion models deliver high-fidelity synthesis but remain slow due to iterative sampling. We empirically observe feature invariance in deterministic sampling and present InvarDiff, a training-free acceleration method that exploits relative temporal invariance at both the timestep and layer scales. From a few deterministic runs, we compute a per-timestep, per-layer, per-module binary cache plan matrix, using quantile-based change metrics to specify which modules at which steps are reused rather than recomputed; a re-sampling correction avoids drift when consecutive caches occur. The same invariance criterion is applied at the step scale to enable cross-timestep caching, deciding whether an entire step can reuse cached results. During inference, InvarDiff performs step-first and layer-wise caching guided by this matrix. Applied to DiT and FLUX, our approach reduces redundant computation while preserving fidelity. Experiments show that InvarDiff achieves $2$-$3\times$ end-to-end speed-ups with minimal impact on standard quality metrics, and qualitatively we observe almost no degradation in visual quality compared with full computation.
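The following is a minimal sketch, not the authors' implementation, of how a quantile-based binary cache plan and the step-scale skip decision described above might be realized. The function names (`build_cache_plan`, `step_can_be_skipped`), the `(timestep, layer, module)` array layout, and the choice of quantile are assumptions for illustration only.

```python
# Hypothetical sketch: derive a binary cache plan matrix from per-timestep,
# per-layer, per-module feature-change statistics gathered during a few
# deterministic calibration runs, then make the step-scale reuse decision.
import numpy as np

def build_cache_plan(changes: np.ndarray, q: float = 0.3) -> np.ndarray:
    """changes: array of shape (T, L, M) holding a relative-change metric
    (e.g. normalized feature difference between consecutive timesteps)
    for each timestep T, layer L, and module M.

    Returns a boolean matrix of the same shape: True means the module's
    cached output is reused at that step, False means it is recomputed."""
    threshold = np.quantile(changes, q)   # global quantile threshold (assumed)
    plan = changes <= threshold           # reuse where the change is small
    plan[0] = False                       # the first step is always computed
    return plan

def step_can_be_skipped(plan: np.ndarray, t: int) -> bool:
    """Step-scale decision: if every module at step t is marked reusable,
    the entire step can reuse cached results (cross-timestep caching)."""
    return bool(plan[t].all())

# Toy usage with random calibration statistics.
rng = np.random.default_rng(0)
changes = rng.random((50, 28, 2))         # 50 steps, 28 layers, 2 modules
plan = build_cache_plan(changes, q=0.3)
print("fraction of modules reused:", plan.mean())
print("fully skippable steps:", [t for t in range(50) if step_can_be_skipped(plan, t)])
```

At inference time, a layer-wise caching loop would consult `plan[t, l, m]` before each module call and return the cached activation when the entry is True, falling back to full computation otherwise; the re-sampling correction mentioned above is omitted from this sketch.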