GALOPP: Multi-Agent Deep Reinforcement Learning For Persistent Monitoring With Localization Constraints

Persistently monitoring a region under localization and communication constraints is a challenging problem. In this paper, we consider a heterogeneous robotic system consisting of two types of agents: anchor agents, which have accurate localization capability, and auxiliary agents, which have low localization accuracy. To localize itself, an auxiliary agent must be within the communication range of an anchor agent, directly or indirectly. The objective of the robotic team is to minimize the uncertainty in the environment through persistent monitoring. We propose a multi-agent deep reinforcement learning (MADRL) based architecture with graph attention called Graph Localized Proximal Policy Optimization (GALOPP), which incorporates the localization and communication constraints of the agents along with the persistent monitoring objective to determine motion policies for each agent. We evaluate the performance of GALOPP on three different custom-built environments. The results show that the agents are able to learn a stable policy and outperform greedy and random search baseline approaches.
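The localization constraint in the abstract can be read as a graph reachability condition. The following is a minimal sketch under stated assumptions, not the paper's implementation: it assumes 2D positions, a single shared disk-shaped communication range, and that "indirectly" means a multi-hop chain of agents leading back to an anchor. The function name `localized_agents` and the agent ids are hypothetical.

```python
from collections import deque
from math import dist


def localized_agents(positions, anchors, comm_range):
    """Return the set of agent ids that can localize (hypothetical helper).

    positions  : dict mapping agent id -> (x, y) position
    anchors    : set of agent ids assumed to have accurate localization
    comm_range : assumed maximum distance at which two agents can communicate
    """
    # Anchors are localized by definition; breadth-first search outward
    # from them over the communication graph.
    localized = set(anchors)
    queue = deque(anchors)
    while queue:
        current = queue.popleft()
        for other, pos in positions.items():
            in_range = dist(positions[current], pos) <= comm_range
            if other not in localized and in_range:
                localized.add(other)
                queue.append(other)
    return localized


positions = {
    "anchor0": (0.0, 0.0),
    "aux1": (1.0, 0.0),  # one hop from the anchor
    "aux2": (2.0, 0.0),  # reaches the anchor only through aux1
    "aux3": (5.0, 5.0),  # out of range of every other agent
}
reachable = localized_agents(positions, {"anchor0"}, comm_range=1.5)
print(sorted(reachable))  # ['anchor0', 'aux1', 'aux2']
```

In this toy configuration, aux2 localizes indirectly through aux1, while aux3 is disconnected from every anchor and cannot localize.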


Related content

Deep reinforcement learning


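Deep reinforcement learning (DRL) is a machine learning approach that extends traditional reinforcement learning methods with deep learning techniques. The main task of traditional reinforcement learning is to let an agent learn reward-maximizing behavior from the rewards it receives from the environment. However, traditional model-free reinforcement learning methods need function approximation so that the agent can learn a value function or a policy. In this setting, the strong function approximation capability of deep learning is a natural replacement for hand-specified features, and it makes better-performing end-to-end learning possible.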

Optimization Algorithms on Deep Learning (73-page slide deck)

June 16, 2021

[DeepMind] Model-Based Reinforcement Learning (174-page slide deck)

January 12, 2021

A Game Theoretic Framework for Model Based Reinforcement Learning

April 19, 2020

[University of Oxford] Deep Residual Reinforcement Learning

February 18, 2020