Inter-agent communication is an effective mechanism for improving performance in collaborative multi-agent reinforcement learning (MARL) systems. However, the communication latency inherent in practical systems delays action decisions and causes agents to share outdated information, which limits MARL performance gains, particularly in time-critical applications such as autonomous driving. In this work, we propose a Value-of-Information aware Low-latency Communication (VIL2C) scheme that proactively adjusts the latency distribution to mitigate these effects in MARL systems. Specifically, we define a Value of Information (VoI) metric that quantifies the importance of transmitting each delayed message. Moreover, we propose a progressive message reception mechanism that adaptively adjusts the reception duration based on the messages already received. We derive the optimized VoI-aware resource allocation and theoretically prove the performance advantage of the proposed VIL2C scheme. Extensive experiments demonstrate that VIL2C outperforms existing approaches under various communication conditions. These gains are attributed to the low-latency transmission of high-VoI messages via resource allocation and the elimination of unnecessary waiting periods via adaptive reception duration.
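To make the idea of an adaptive reception duration concrete, the following is a minimal sketch and not the VIL2C algorithm itself. It assumes a hypothetical stopping rule: an agent stops waiting as soon as the accumulated VoI of the messages received so far reaches a target, or when a maximum waiting time expires. The names `progressive_receive`, `voi_target`, and `max_wait_ms`, as well as the numeric values, are illustrative assumptions.

```python
from typing import List, Tuple

# (arrival_time_ms, voi): a hypothetical message record used only for illustration.
Message = Tuple[float, float]


def progressive_receive(
    messages: List[Message],
    voi_target: float,
    max_wait_ms: float,
) -> Tuple[float, List[Message]]:
    """Return (time at which the agent stops waiting, messages received by then).

    Assumed rule, not taken from the paper: stop early once the cumulative
    VoI of received messages reaches `voi_target`; otherwise wait until
    `max_wait_ms` and act on whatever has arrived.
    """
    received: List[Message] = []
    total_voi = 0.0
    # Process messages in order of arrival time.
    for arrival, voi in sorted(messages):
        if arrival > max_wait_ms:
            break  # this message arrives after the deadline
        received.append((arrival, voi))
        total_voi += voi
        if total_voi >= voi_target:
            return arrival, received  # enough information: stop waiting early
    return max_wait_ms, received  # target not reached: wait out the deadline


# Illustrative arrivals: three timely messages and one badly delayed one.
arrivals = [(5.0, 0.1), (2.0, 0.6), (9.0, 0.5), (30.0, 0.9)]
stop_time, got = progressive_receive(arrivals, voi_target=1.0, max_wait_ms=20.0)
print(stop_time, len(got))  # stops at 9.0 ms with three messages
```

In this toy run the agent acts at 9 ms instead of waiting the full 20 ms, which illustrates the kind of unnecessary waiting period that an adaptive reception duration is meant to remove.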