Diffusion-based generative models achieve state-of-the-art performance across a range of image synthesis tasks, yet their tendency to replicate and amplify dataset biases remains poorly understood. Although previous research has treated bias amplification as an inherent characteristic of diffusion models, this work provides the first analysis of how sampling algorithms and their hyperparameters influence it. We empirically demonstrate that samplers for diffusion models, which are commonly optimized for sample quality and speed, have a significant and measurable effect on bias amplification. Through controlled studies with models trained on Biased MNIST, Multi-Color MNIST, and BFFHQ, and with Stable Diffusion, we show that sampling hyperparameters can induce both bias reduction and amplification, even when the trained model is fixed. Source code is available at https://github.com/How-I-met-your-bias/how_i_met_your_bias.