Modeling user purchase behavior is a critical challenge in display advertising systems and is essential for real-time bidding. The difficulty arises from the sparsity of positive user events and the stochasticity of user actions, which lead to severe class imbalance and irregular event timing. Predictive systems usually rely on hand-crafted "counter" features, overlooking the fine-grained temporal evolution of user intent. Meanwhile, current sequential models extract only the direct sequential signal and miss useful event-counting statistics. We enhance deep sequential models with self-supervised pretraining strategies for display advertising. In particular, we introduce Abacus, a novel approach that predicts the empirical frequency distribution of user events. We further propose a hybrid objective that unifies Abacus with sequential learning objectives, combining the stability of aggregated statistics with the sensitivity of sequence modeling. Experiments on two real-world datasets show that Abacus pretraining outperforms existing methods and accelerates downstream task convergence, while the hybrid approach yields up to +6.1% AUC compared to the baselines.
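To make the idea of predicting an empirical frequency distribution concrete, the following is a minimal sketch and not the authors' implementation. It assumes the target is the normalized count of each event type in a user's sequence, that the model's output logits are scored against this target with cross-entropy, and that the hybrid objective is a weighted sum of the two losses. The abstract specifies none of these details, so the function names (`empirical_frequency`, `cross_entropy`, `hybrid_loss`), the event vocabulary, and the weight `alpha` are all hypothetical.

```python
import math
from collections import Counter


def empirical_frequency(events, vocab):
    """Normalized count of each event type in `events`, ordered by `vocab`."""
    counts = Counter(events)
    total = len(events)
    if total == 0:
        # Assumption: fall back to a uniform target for an empty sequence.
        return [1.0 / len(vocab)] * len(vocab)
    return [counts[e] / total for e in vocab]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def cross_entropy(target, logits):
    """Cross-entropy between a target distribution and predicted logits."""
    probs = softmax(logits)
    # Terms with zero target mass contribute nothing to the sum.
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


def hybrid_loss(freq_loss, seq_loss, alpha=0.5):
    """Assumed hybrid objective: a convex combination of the two losses."""
    return alpha * freq_loss + (1.0 - alpha) * seq_loss


# Hypothetical event vocabulary and user sequence.
vocab = ["view", "click", "purchase"]
events = ["view", "view", "click", "view"]

target = empirical_frequency(events, vocab)
# Untrained model: equal logits for every event type.
freq_loss = cross_entropy(target, [0.0, 0.0, 0.0])
```

In this sketch, `target` summarizes the whole sequence as counts, which is the kind of aggregated statistic the abstract contrasts with order-sensitive sequential objectives. A sequential loss such as next-event prediction would be passed to `hybrid_loss` as `seq_loss`.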