Diffusion Large Language Models (DLLMs) enable fully parallel token decoding but often remain impractical at inference time due to the many denoising iterations required to refine an information-free, fully masked initialization into coherent text. Most existing acceleration methods focus on traversing this generative trajectory more efficiently via improved solvers or sampling strategies. We advance a complementary perspective: shorten the trajectory itself by starting closer to the target distribution through context-aware initialization. We propose a training-free interface that injects prompt-conditioned priors from a lightweight auxiliary model into the diffusion initialization, and instantiate it with two mechanisms: discrete token injection and representation-level embedding interpolation. Because injected priors can be imperfect and unmask-only decoding can over-commit early, we also introduce a simple confidence-based remasking mechanism as a form of prior skepticism. Preliminary evidence on GSM8K suggests that context-aware initialization can substantially reduce denoising iterations (about 35\% fewer function evaluations in our setting), while also exposing a key open challenge: naive warm-starting can degrade final accuracy relative to strong diffusion baselines. We use these findings to motivate a research agenda around calibration, revision mechanisms, and representation alignment for reliable warm-started diffusion decoding.
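To make the proposed interface concrete, the sketch below illustrates discrete token injection and confidence-based remasking in PyTorch. It is a minimal sketch under assumed names: \texttt{MASK\_ID}, \texttt{tau\_inject}, and \texttt{tau\_remask} are illustrative placeholders rather than the method's actual implementation, and the representation-level embedding-interpolation variant is omitted.

\begin{verbatim}
import torch

MASK_ID = 0  # placeholder id for the [MASK] token (assumed)

def warm_start_tokens(aux_logits, tau_inject):
    # Discrete token injection: commit the auxiliary model's prediction
    # only where its confidence exceeds tau_inject; keep the rest masked.
    probs = aux_logits.softmax(dim=-1)       # [batch, length, vocab]
    conf, tokens = probs.max(dim=-1)         # [batch, length]
    return torch.where(conf >= tau_inject, tokens,
                       torch.full_like(tokens, MASK_ID))

def remask_low_confidence(tokens, dllm_logits, tau_remask):
    # Prior skepticism: after a denoising step, re-mask committed tokens
    # whose confidence under the diffusion model falls below tau_remask.
    conf = (dllm_logits.softmax(dim=-1)
            .gather(-1, tokens.unsqueeze(-1)).squeeze(-1))
    keep = (tokens == MASK_ID) | (conf >= tau_remask)
    return torch.where(keep, tokens, torch.full_like(tokens, MASK_ID))
\end{verbatim}

In this sketch, the warm start replaces the fully masked initialization with high-confidence auxiliary predictions, and the remasking step allows the diffusion model to revoke injected tokens it no longer trusts during denoising.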