Multi-Agent Exploration of an Unknown Sparse Landmark Complex via Deep Reinforcement Learning - 专知 Papers


September 23, 2022

Multi-Agent Exploration of an Unknown Sparse Landmark Complex via Deep Reinforcement Learning


Xiatao Sun, Yuwei Wu, Subhrajit Bhattacharya, Vijay Kumar

In recent years, Landmark Complexes have been successfully employed for localization-free and metric-free autonomous exploration using a group of sensing-limited and communication-limited robots in a GPS-denied environment. To ensure rapid and complete exploration, existing works make assumptions about the density and distribution of landmarks in the environment. These assumptions may be overly restrictive, especially in hazardous environments where landmarks may be destroyed or completely missing. In this paper, we first propose a deep reinforcement learning framework for multi-agent cooperative exploration in environments with sparse landmarks while reducing client-server communication. By leveraging recent developments in partial observability and credit assignment, our framework can train the exploration policy efficiently for multi-robot systems. The policy receives individual rewards from actions based on a proximity sensor with limited range and resolution; these are combined with group rewards to encourage collaborative exploration and construction of the Landmark Complex through observation of 0-, 1- and 2-dimensional simplices. In addition, we employ a three-stage curriculum learning strategy to mitigate reward sparsity by gradually adding random obstacles and destroying random landmarks. Experiments in simulation demonstrate that our method outperforms the state-of-the-art landmark complex exploration method in efficiency across different environments with sparse landmarks.
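The reward structure described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the collision-style individual penalty, the `safe_dist` threshold, and the `bonus` weight are all assumptions made for illustration. The sketch treats every set of landmarks seen simultaneously by one robot as spanning simplices of dimension 0, 1 and 2, and gives the whole team a shared bonus for each simplex not yet in the complex.

```python
from itertools import combinations


def simplices_from_observation(visible_landmarks, max_dim=2):
    """Simplices (as sorted tuples of landmark IDs) spanned by co-observed landmarks."""
    ids = sorted(set(visible_landmarks))
    found = set()
    for k in range(1, max_dim + 2):  # k vertices form a (k-1)-simplex
        found.update(combinations(ids, k))
    return found


def individual_reward(ranges, safe_dist=0.5, penalty=-1.0):
    """Hypothetical per-robot term: penalize a proximity reading below safe_dist."""
    return penalty if min(ranges) < safe_dist else 0.0


def group_reward(known, observations, bonus=1.0):
    """Hypothetical shared term: bonus per simplex newly added to the complex."""
    seen = set().union(*(simplices_from_observation(o) for o in observations))
    new = seen - known
    return bonus * len(new), known | new


def step_rewards(known, observations, ranges_per_robot):
    """Combine individual and group terms; return (rewards, updated complex)."""
    shared, known = group_reward(known, observations)
    rewards = [individual_reward(r) + shared for r in ranges_per_robot]
    return rewards, known


# Two robots: one sees landmarks 1, 2, 3 and is close to an obstacle,
# the other sees landmarks 3, 4 with free space around it.
rewards, complex_ = step_rewards(set(), [[1, 2, 3], [3, 4]], [[0.2, 1.0], [0.9, 1.0]])
print(rewards)        # [8.0, 9.0]
print(len(complex_))  # 9
```

In this toy run the team discovers nine simplices (four vertices, four edges and one triangle), so both robots receive the shared bonus, while only the first is penalized for its close proximity reading. Repeating the same observations yields no further group reward, which is the sparsity that the curriculum in the abstract is meant to ease.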

