Sponsored search ads appear next to search results when people look for products and services on search engines. In recent years, they have become one of the most lucrative marketing channels. As the foundation of search advertising, relevance modeling has attracted increasing attention because of its significant research challenges and tremendous practical value. Most existing approaches rely solely on the semantic information in the input query-ad pair, but the semantic information in short ad text alone is not sufficient to fully identify users' search intent. Our motivation is to incorporate the large volume of unsupervised user behavior data in historical search logs as a complementary graph to facilitate relevance modeling. In this paper, we extensively investigate how to naturally fuse the semantic textual information with the user behavior graph, and we propose three novel AdsGNN models that aggregate the topological neighborhood from the perspectives of nodes, edges, and tokens. Furthermore, two critical but rarely investigated problems, domain-specific pre-training and long-tail ads matching, are studied thoroughly. Empirically, we evaluate the AdsGNN models on a large industry dataset, and the results of both online and offline tests consistently demonstrate the superiority of our proposal.