Two-player, constant-sum games are well studied in the literature, but there has been limited progress outside of this setting. We propose Joint Policy-Space Response Oracles (JPSRO), an algorithm for training agents in n-player, general-sum extensive-form games, which provably converges to an equilibrium. We further suggest correlated equilibria (CE) as promising meta-solvers, and propose a novel solution concept, Maximum Gini Correlated Equilibrium (MGCE), a principled and computationally efficient family of solutions to the correlated equilibrium selection problem. We conduct several experiments using CE meta-solvers for JPSRO and demonstrate convergence on n-player, general-sum games.
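To make the correlated equilibrium condition concrete, the following is a minimal sketch rather than the authors' implementation. It checks whether a joint distribution over actions in a small normal-form game satisfies the CE constraints (no player gains by deviating from a recommended action), and computes the Gini impurity 1 - sum(sigma^2), which we assume is the quantity MGCE maximizes among such distributions. The function names `max_deviation_gain` and `gini_impurity`, and the game of chicken payoffs, are illustrative assumptions and do not come from the abstract.

```python
import numpy as np


def max_deviation_gain(payoffs, sigma):
    """Largest expected gain any player obtains by deviating from a recommendation.

    payoffs: list with one array per player, each of shape (A_1, ..., A_n).
    sigma:   joint distribution over action profiles, same shape.
    sigma is a correlated equilibrium when the returned value is <= 0
    (up to numerical tolerance); "deviating" to the recommended action
    itself always yields a gain of 0, so the value is never negative.
    """
    worst = -np.inf
    for p, u in enumerate(payoffs):
        n_actions = u.shape[p]
        # Put player p's own action on axis 0 and flatten the other players.
        u_p = np.moveaxis(u, p, 0).reshape(n_actions, -1)
        s_p = np.moveaxis(sigma, p, 0).reshape(n_actions, -1)
        # dev_value[rec, dev]: payoff from playing `dev` when `rec` was recommended,
        # weighted by the joint probability of (rec, other players' actions).
        dev_value = s_p @ u_p.T
        # rec_value[rec]: payoff from following the recommendation.
        rec_value = np.sum(s_p * u_p, axis=1, keepdims=True)
        worst = max(worst, float((dev_value - rec_value).max()))
    return worst


def gini_impurity(sigma):
    """Gini impurity of a joint distribution: 1 - sum of squared probabilities."""
    return 1.0 - float(np.sum(sigma ** 2))


# Hypothetical example: game of chicken, actions 0 = Dare, 1 = Chicken.
u_row = np.array([[0.0, 7.0], [2.0, 6.0]])
u_col = np.array([[0.0, 2.0], [7.0, 6.0]])
payoffs = [u_row, u_col]

# Mass 1/3 on each of (Dare, Chicken), (Chicken, Dare), (Chicken, Chicken).
sigma_ce = np.array([[0.0, 1 / 3], [1 / 3, 1 / 3]])
# All mass on (Chicken, Chicken): each player would rather deviate to Dare.
sigma_bad = np.array([[0.0, 0.0], [0.0, 1.0]])

gain_ce = max_deviation_gain(payoffs, sigma_ce)
gain_bad = max_deviation_gain(payoffs, sigma_bad)
gini_ce = gini_impurity(sigma_ce)
```

Here `sigma_ce` passes the check while `sigma_bad` does not. Under the stated assumption, selecting an equilibrium would amount to maximizing `gini_impurity` over all distributions that pass the check, a quadratic objective subject to linear constraints; this sketch only evaluates candidates and does not solve that program.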