Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity

Post-training alignment often reduces LLM diversity, leading to a phenomenon known as mode collapse. Unlike prior work that attributes this effect to algorithmic limitations, we identify a fundamental, pervasive data-level driver: typicality bias in preference data, whereby annotators systematically favor familiar text, a tendency consistent with well-established findings in cognitive psychology. We formalize this bias theoretically, verify it empirically on preference datasets, and show that it plays a central role in mode collapse. Motivated by this analysis, we introduce Verbalized Sampling (VS), a simple, training-free prompting strategy to circumvent mode collapse. VS prompts the model to verbalize a probability distribution over a set of responses (e.g., "Generate 5 jokes about coffee and their corresponding probabilities"). Comprehensive experiments show that VS significantly improves performance across creative writing (poems, stories, jokes), dialogue simulation, open-ended QA, and synthetic data generation, without sacrificing factual accuracy or safety. For instance, in creative writing, VS increases diversity by 1.6-2.1x over direct prompting. We further observe an emergent trend: more capable models benefit more from VS. In sum, our work provides a new data-centric perspective on mode collapse and a practical inference-time remedy that helps unlock pre-trained generative diversity.
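To make the prompting pattern concrete, the following is a minimal Python sketch of how a VS-style request might be built and its reply consumed. Only the instruction wording ("Generate 5 jokes about coffee and their corresponding probabilities") comes from the abstract. The `||` line format, the helper names `build_vs_prompt`, `parse_candidates`, and `sample_response`, and the step of sampling one response in proportion to its verbalized probability are assumptions made for illustration, not the authors' implementation. No model call is shown; the reply is treated as a plain string from whatever LLM client is in use.

```python
import random
import re


def build_vs_prompt(task: str, k: int = 5) -> str:
    """Wrap a task in a Verbalized Sampling style instruction.

    The first sentence mirrors the example in the abstract; the output
    format on the second line is an assumption made for easy parsing.
    """
    return (
        f"Generate {k} {task} and their corresponding probabilities.\n"
        "Write each one on its own line as: <response> || <probability>"
    )


# Hypothetical line format: free text, the '||' separator, then a number.
_LINE = re.compile(r"^(?P<text>.+?)\s*\|\|\s*(?P<prob>\d*\.?\d+)\s*$")


def parse_candidates(reply: str) -> list[tuple[str, float]]:
    """Extract (response, probability) pairs, skipping malformed lines."""
    pairs = []
    for line in reply.splitlines():
        match = _LINE.match(line.strip())
        if match:
            pairs.append((match.group("text"), float(match.group("prob"))))
    return pairs


def sample_response(pairs: list[tuple[str, float]], rng: random.Random) -> str:
    """Draw one response, weighted by its verbalized probability.

    Weights need not sum to 1; random.choices treats them as relative.
    If every weight is zero, fall back to a uniform draw.
    """
    if not pairs:
        raise ValueError("no candidates parsed from the model reply")
    texts = [text for text, _ in pairs]
    weights = [prob for _, prob in pairs]
    if sum(weights) <= 0:
        return rng.choice(texts)
    return rng.choices(texts, weights=weights, k=1)[0]
```

In use, `build_vs_prompt("jokes about coffee")` would be sent to the model, and the reply passed through `parse_candidates` and then `sample_response`. Weighted sampling is only one way to consume the list; the whole set of candidates could equally be kept.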
