Communication is an effective means of improving cooperative policy learning in multi-agent systems. In most real-world scenarios, however, lossy communication is a prevalent issue. Existing communication-based multi-agent reinforcement learning methods, owing to their limited scalability and robustness, struggle to operate in complex and dynamic real-world environments. To address these challenges, we propose a generalized communication-constrained model that uniformly characterizes communication conditions across different scenarios. We then use this model as a learning prior to distinguish lossy from lossless messages in a given scenario. Furthermore, drawing on a dual mutual information estimator, we decouple the impact of lossy and lossless messages on distributed decision-making, and introduce a communication-constrained multi-agent reinforcement learning framework that quantifies the impact of communication messages within the global reward. Finally, we validate the effectiveness of our approach on several communication-constrained benchmarks.
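To make the role of mutual information concrete, the sketch below estimates empirical mutual information between an agent's local state and the message received over two channels: a lossless channel and a lossy one that drops packets. This is a minimal toy illustration, not the paper's dual (neural) estimator; the helper names (`mutual_information`, `lossy_channel`) and the 50% drop probability are assumptions introduced here for exposition.

```python
import math
import random
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))          # joint counts
    px, py = Counter(xs), Counter(ys)   # marginal counts
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def lossy_channel(msg, drop_p, rng):
    """Toy communication constraint: each message is dropped with prob drop_p."""
    return msg if rng.random() >= drop_p else None  # None marks a lost packet

rng = random.Random(0)
states = [rng.randint(0, 3) for _ in range(20000)]     # 4-valued local state
lossless = states[:]                                   # perfect channel
lossy = [lossy_channel(s, 0.5, rng) for s in states]   # half the packets lost

mi_lossless = mutual_information(states, lossless)     # ≈ log 4 ≈ 1.386 nats
mi_lossy = mutual_information(states, lossy)           # strictly smaller
```

Under these assumptions the lossless channel preserves the full entropy of the state, while packet loss reduces the measurable mutual information, which is the quantity the framework uses to weigh a message's contribution to the global reward.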