A key goal of ad hoc teamwork is to develop a learning agent that cooperates with unknown teams without relying on any pre-coordination protocol. Although the literature contains a vast number of ad hoc teamwork algorithms, most of them cannot learn to cooperate with a completely unknown team unless they learn from scratch. This article presents a novel approach that combines transfer learning with the state-of-the-art PLASTIC-Policy to adapt quickly to completely unknown teammates. We test our solution in the Half Field Offense simulator with five different teammates, which were designed independently by developers from different countries and at different times. Our empirical evaluation shows that an ad hoc agent benefits from leveraging its past knowledge when adapting to a new team, rather than learning to cooperate with it from scratch.