Generative models are increasingly being explored in the field of click-through rate (CTR) prediction to overcome the limitations of the conventional discriminative paradigm, which relies on a simple binary classification objective. However, existing generative models typically confine the generative paradigm to the training phase, where it serves primarily for representation learning. During online inference, they revert to a standard discriminative paradigm and fail to leverage their generative capabilities to further improve prediction accuracy. This fundamental asymmetry between the training and inference phases prevents the generative paradigm from realizing its full potential. To address this limitation, we propose the Symmetric Masked Generative Paradigm for CTR prediction (SGCTR), a novel framework that establishes symmetry between the training and inference phases. Specifically, SGCTR acquires generative capabilities by learning feature dependencies during training, then applies those capabilities during online inference to iteratively redefine the features of input samples, which mitigates the impact of noisy features and enhances prediction accuracy. Extensive experiments validate the superiority of SGCTR, demonstrating that applying the generative paradigm symmetrically across both training and inference significantly unlocks its power in CTR prediction.
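The inference-time idea described in the abstract (mask a feature, regenerate it from the remaining features, and repeat) can be illustrated with a toy sketch. This is not the SGCTR architecture, which the abstract does not specify. It assumes categorical features and substitutes a smoothed pairwise co-occurrence model for the learned generative model; the names `fit_cooccurrence`, `predict_masked`, `refine`, and the parameters `n_rounds` and `threshold` are all hypothetical.

```python
import math
from collections import defaultdict


def fit_cooccurrence(rows):
    """Training phase (stand-in): count how often value a in field i
    co-occurs with value b in field j, as a crude model of feature dependencies."""
    counts = defaultdict(int)
    for row in rows:
        for i, a in enumerate(row):
            for j, b in enumerate(row):
                if i != j:
                    counts[(i, a, j, b)] += 1
    return counts


def predict_masked(counts, row, i, values):
    """Treat field i as masked and score each candidate value from the other
    fields, using a product of add-one-smoothed pairwise conditionals."""
    log_scores = []
    for a in values:
        score = 0.0
        for j, b in enumerate(row):
            if j == i:
                continue
            total = sum(counts.get((i, v, j, b), 0) + 1 for v in values)
            score += math.log((counts.get((i, a, j, b), 0) + 1) / total)
        log_scores.append(score)
    top = max(log_scores)
    weights = [math.exp(s - top) for s in log_scores]
    z = sum(weights)
    return {a: w / z for a, w in zip(values, weights)}


def refine(counts, row, values, n_rounds=2, threshold=0.1):
    """Inference phase (stand-in): iteratively replace any feature whose
    observed value looks implausible given the rest of the sample."""
    row = list(row)
    for _ in range(n_rounds):
        for i in range(len(row)):
            probs = predict_masked(counts, row, i, values)
            if probs[row[i]] < threshold:  # observed value is likely noise
                row[i] = max(probs, key=probs.get)
    return row


# Toy data in which all three fields always agree.
train = [(0, 0, 0)] * 50 + [(1, 1, 1)] * 50
counts = fit_cooccurrence(train)

# The third feature of this sample contradicts the other two.
print(refine(counts, (0, 0, 1), values=[0, 1]))  # prints [0, 0, 0]
```

In a full pipeline the refined sample, rather than the raw one, would then be passed to the CTR predictor. The threshold keeps features that are merely ambiguous and rewrites only those the model considers very unlikely.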