While extensive-form games (EFGs) can be converted into normal-form games (NFGs), doing so comes at the cost of an exponential blowup of the strategy space. So, progress on NFGs and EFGs has historically followed separate tracks, with the EFG community often having to catch up with advances (e.g., last-iterate convergence and predictive regret bounds) from the larger NFG community. In this paper we show that the Optimistic Multiplicative Weights Update (OMWU) algorithm -- the premier learning algorithm for NFGs -- can be simulated on the normal-form equivalent of an EFG in linear time per iteration in the game tree size using a kernel trick. The resulting algorithm, Kernelized OMWU (KOMWU), applies more broadly to all convex games whose strategy space is a polytope with 0/1 integral vertices, as long as the kernel can be evaluated efficiently. In the particular case of EFGs, KOMWU closes several standing gaps between NFG and EFG learning, by enabling direct, black-box transfer to EFGs of desirable properties of learning dynamics that were so far known to be achievable only in NFGs. Specifically, KOMWU gives the first algorithm that guarantees at the same time last-iterate convergence, lower dependence on the size of the game tree than all prior algorithms, and $\tilde{\mathcal{O}}(1)$ regret when followed by all players.
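As a toy illustration of the OMWU dynamics referenced above (in the plain normal-form setting, not the kernelized EFG variant introduced in the paper), the sketch below runs OMWU in self-play on matching pennies. The update uses the standard optimistic prediction $2\ell_t - \ell_{t-1}$ on the loss vector; the step size `eta` and the game matrix are illustrative choices, not from the paper.

```python
import numpy as np

def omwu_step(x, loss_t, loss_prev, eta=0.1):
    """One Optimistic Multiplicative Weights Update step on the simplex.

    Applies the multiplicative update with the optimistic (predictive)
    correction 2*loss_t - loss_prev.
    """
    logits = np.log(x) - eta * (2.0 * loss_t - loss_prev)
    logits -= logits.max()          # numerical stability before exponentiating
    y = np.exp(logits)
    return y / y.sum()

# Matching pennies: payoff matrix for the row (maximizing) player.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])
x = np.array([0.7, 0.3])            # row player's mixed strategy (off-equilibrium start)
y = np.array([0.4, 0.6])            # column player's mixed strategy
gx_prev = np.zeros(2)
gy_prev = np.zeros(2)
for _ in range(2000):
    gx = -(A @ y)                   # row player's loss vector
    gy = A.T @ x                    # column player's loss vector
    x, y = omwu_step(x, gx, gx_prev), omwu_step(y, gy, gy_prev)
    gx_prev, gy_prev = gx, gy
# In this zero-sum game, OMWU's last iterate approaches the unique
# Nash equilibrium (0.5, 0.5), unlike vanilla MWU, which cycles.
```

Vanilla MWU run on the same game orbits around the equilibrium; the optimistic correction is what yields the last-iterate convergence that KOMWU transfers to extensive-form games.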