Recent years have seen the application of deep reinforcement learning techniques to cooperative multi-agent systems, with great empirical success. However, given the lack of theoretical insight, it remains unclear what the employed neural networks are learning, or how we should enhance their representational power to address the problems on which they fail. In this work, we empirically investigate the representational power of various network architectures on a series of one-shot games. Despite their simplicity, these games capture many of the crucial problems that arise in the multi-agent setting, such as an exponential number of joint actions or the lack of an explicit coordination mechanism. Our results quantify how well various approaches can represent the requisite value functions, and help us identify issues that can impede good performance.