Multi-Agent Reinforcement Learning with Communication-Constrained Priors

Communication is an effective means of improving cooperative policy learning in multi-agent systems. In most real-world scenarios, however, communication is lossy. Existing communication-based multi-agent reinforcement learning methods have limited scalability and robustness, which makes them hard to apply in complex, dynamic real-world environments. To address these challenges, we propose a generalized communication-constrained model that uniformly characterizes communication conditions across different scenarios. We then use this model as a learning prior to distinguish lossy from lossless messages in a given scenario. In addition, we decouple the impact of lossy and lossless messages on distributed decision-making using a dual mutual information estimator, and we introduce a communication-constrained multi-agent reinforcement learning framework that quantifies the impact of communication messages and incorporates it into the global reward. Finally, we validate the effectiveness of our approach on several communication-constrained benchmarks.

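
To make the distinction between lossy and lossless messages concrete, the following is a minimal sketch of a lossy channel. It is not the communication-constrained model proposed in this work, whose details the abstract does not give. It assumes a simple channel in which each message is either dropped, corrupted by Gaussian noise, or delivered unchanged. The names `ChannelConfig`, `transmit`, `drop_prob`, `noise_prob`, and `noise_std` are hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ChannelConfig:
    """Hypothetical channel parameters (assumed, not taken from the paper)."""
    drop_prob: float = 0.2   # probability that a message is lost entirely
    noise_prob: float = 0.3  # probability that a delivered message is corrupted
    noise_std: float = 0.1   # std of the Gaussian noise added on corruption


def transmit(
    messages: List[List[float]],
    cfg: ChannelConfig,
    rng: random.Random,
) -> Tuple[List[Optional[List[float]]], List[bool]]:
    """Pass each message through the assumed channel.

    Returns the received messages (None when dropped) and a per-message
    flag that is True only when the message arrived unchanged (lossless).
    """
    received: List[Optional[List[float]]] = []
    lossless: List[bool] = []
    for msg in messages:
        if rng.random() < cfg.drop_prob:
            # Message lost in transit.
            received.append(None)
            lossless.append(False)
        elif rng.random() < cfg.noise_prob:
            # Message delivered, but every component is perturbed.
            received.append([x + rng.gauss(0.0, cfg.noise_std) for x in msg])
            lossless.append(False)
        else:
            # Message delivered intact.
            received.append(list(msg))
            lossless.append(True)
    return received, lossless


if __name__ == "__main__":
    rng = random.Random(0)
    sent = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    got, flags = transmit(sent, ChannelConfig(), rng)
    for s, g, ok in zip(sent, got, flags):
        print(s, "->", g, "(lossless)" if ok else "(lossy)")
```

In this sketch the `lossless` flags play the role of a known prior over channel conditions: a downstream learner could treat flagged and unflagged messages differently, which is the kind of separation the abstract describes. How the actual framework derives and uses that separation is not specified here.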