Zero-Sum多玩家广泛形式运动会中的 Ex Ante Ante 协调队-Maxmin Equiblibria (Computing Ex Ante Coordinated Team-Maxmin Equilibria in Zero-Sum Multiplayer Extensive-Form Games)

from arxiv, AAAI 2021. This paper also is a part of the thesis: Youzhi Zhang, February 2020. Computing Team-Maxmin Equilibria in Zero-Sum Multiplayer Games. PhD Thesis, https://personal.ntu.edu.sg/boan/thesis/Zhang_Youzhi_PhD_Thesis.pdf

Computational game theory has many applications in the modern world in both adversarial situations and the optimization of social good. While there exist many algorithms for computing solutions in two-player interactions, finding optimal strategies in multiplayer interactions efficiently remains an open challenge. This paper focuses on computing the multiplayer Team-Maxmin Equilibrium with Coordination device (TMECor) in zero-sum extensive-form games. TMECor models scenarios when a team of players coordinates ex ante against an adversary. Such situations can be found in card games (e.g., in Bridge and Poker), when a team works together to beat a target player but communication is prohibited; and also in real world, e.g., in forest-protection operations, when coordinated groups have limited contact during interdicting illegal loggers. The existing algorithms struggle to find a TMECor efficiently because of their high computational costs. To compute a TMECor in larger games, we make the following key contributions: (1) we propose a hybrid-form strategy representation for the team, which preserves the set of equilibria; (2) we introduce a column-generation algorithm with a guaranteed finite-time convergence in the infinite strategy space based on a novel best-response oracle; (3) we develop an associated-representation technique for the exact representation of the multilinear terms in the best-response oracle; and (4) we experimentally show that our algorithm is several orders of magnitude faster than prior state-of-the-art algorithms in large games.

翻译：计算游戏理论在现代世界中有许多应用, 包括对抗性游戏和优化社会公益。虽然在两个玩家互动中有许多计算解决方案的算法, 但找到多玩者互动的最佳战略仍然是一项公开的挑战。本文侧重于在零和广泛形式的游戏中计算多玩者团队- Maxmin Equilium 与协调设备( TMECor) 之间的平衡。当一个玩家团队对一个对手进行前期协调时, TMECor 模型的情景在较快的游戏中具有许多应用性。在纸牌游戏中( 例如, 在Bridge和Poker), 当一个团队共同努力击败一个目标玩家, 但通信却被禁止; 在现实世界中, 比如在森林保护行动中, 当协调的团体在拦截非法日志时接触有限。现有的算法由于它们计算成本高, 很难找到一个TMECor 。为了在更大的游戏中计算出一个TMECor, 我们做出以下关键贡献:(1) 我们提议了一个混合的策略代表状态, 以保持一个松动式的组合; (2) 我们提出一个以最接近的算式的货币代表, 以我们最有保证的策略以最有保证的的极的的的的直压。