Multi-Agent Cooperative Bidding Games for Multi-Objective Optimization in e-Commercial Sponsored Search

Bid optimization for online advertising from a single advertiser's perspective has been thoroughly investigated in both academic research and industrial practice. However, existing work typically assumes that competitors do not change their bids, i.e., that the winning price is fixed, which leads to poor performance of the derived solution. Although a few studies use multi-agent reinforcement learning to set up a cooperative game, they still suffer from the following drawbacks: (1) They fail to avoid collusive solutions, in which all the advertisers involved in an auction deliberately bid an extremely low price. (2) They cannot handle the underlying complex bidding environment well, leading to poor model convergence. This problem can be amplified when handling multiple advertiser objectives, which are practical demands but are not considered by previous work. In this paper, we propose a novel multi-objective cooperative bid optimization formulation called Multi-Agent Cooperative bidding Games (MACG). MACG sets up a carefully designed multi-objective optimization framework that incorporates the different objectives of advertisers. A global objective to maximize the overall profit of all advertisements is added in order to encourage better cooperation and to protect self-bidding advertisers. To avoid collusion, we also introduce an extra platform revenue constraint. We analyze the optimal functional form of the bidding formula theoretically and design a policy network accordingly to generate auction-level bids. We then design an efficient multi-agent evolutionary strategy for model optimization. Offline experiments and online A/B tests conducted on the Taobao platform indicate that both individual advertisers' objectives and the global profit are significantly improved compared to state-of-the-art methods.
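The collusion problem and the role of a platform revenue constraint can be illustrated with a toy example. The sketch below is not MACG: it assumes a single-slot second-price auction, and the valuations, bids, and the `REVENUE_FLOOR` threshold are hypothetical numbers chosen only for illustration. It shows that when all advertisers coordinate on very low bids, the winner's profit rises while platform revenue collapses, and that a revenue floor rules out such an outcome.

```python
# Toy illustration only: single-slot second-price auction (an assumption,
# not the auction mechanism described in the abstract).

def second_price_auction(bids):
    """Return (winner_index, price_paid): highest bidder wins, pays second-highest bid."""
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    winner = order[0]
    price = bids[order[1]] if len(bids) > 1 else 0.0
    return winner, price


def outcome(values, bids):
    """Return (winner's profit, platform revenue) for one auction."""
    winner, price = second_price_auction(bids)
    return values[winner] - price, price


# Hypothetical private valuations of three advertisers.
values = [10.0, 8.0, 6.0]

# Case 1: everyone bids their valuation.
truthful = outcome(values, bids=values)

# Case 2: everyone colludes on extremely low bids.
collusive = outcome(values, bids=[0.5, 0.25, 0.125])

# Hypothetical platform revenue floor used as a constraint.
REVENUE_FLOOR = 5.0


def satisfies_revenue_constraint(revenue, floor=REVENUE_FLOOR):
    """True if the platform revenue meets the required minimum."""
    return revenue >= floor


print("truthful :", truthful, satisfies_revenue_constraint(truthful[1]))
print("collusive:", collusive, satisfies_revenue_constraint(collusive[1]))
```

In this toy setting the collusive bids give the winner a larger profit but violate the revenue floor, which is the kind of solution a revenue constraint is meant to exclude.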
