Multi-agent reinforcement learning (MARL) has achieved strong results in cooperative tasks but typically assumes fixed, fully controlled teams. Ad hoc teamwork (AHT) relaxes this assumption by allowing collaboration with unknown partners, yet existing variants still presume shared conventions. We introduce Multi-party Ad Hoc Teamwork (MAHT), in which controlled agents must coordinate with multiple mutually unfamiliar groups of uncontrolled teammates. To address this setting, we propose MARs, which builds a sparse skeleton graph and applies relational modeling to capture cross-group dynamics. Experiments on MPE and StarCraft II show that MARs outperforms MARL and AHT baselines while converging faster.