Centralized Training with Hybrid Execution in Multi-Agent Reinforcement Learning


June 5, 2023


Pedro P. Santos, Diogo S. Carvalho, Miguel Vasco, Alberto Sardinha, Pedro A. Santos, Ana Paiva, Francisco S. Melo

We introduce hybrid execution in multi-agent reinforcement learning (MARL), a new paradigm in which agents aim to successfully complete cooperative tasks with arbitrary communication levels at execution time by taking advantage of information-sharing among the agents. Under hybrid execution, the communication level can range from a setting in which no communication is allowed between agents (fully decentralized), to a setting featuring full communication (fully centralized), but the agents do not know beforehand which communication level they will encounter at execution time. To formalize our setting, we define a new class of multi-agent partially observable Markov decision processes (POMDPs) that we name hybrid-POMDPs, which explicitly model a communication process between the agents. We contribute MARO, an approach that makes use of an auto-regressive predictive model, trained in a centralized manner, to estimate missing agents' observations at execution time. We evaluate MARO on standard scenarios and extensions of previous benchmarks tailored to emphasize the negative impact of partial observability in MARL. Experimental results show that our method consistently outperforms relevant baselines, allowing agents to act with faulty communication while successfully exploiting shared information.
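The execution-time setting described above can be illustrated with a minimal sketch. It is not the authors' MARO implementation: every name here (`sample_comm_matrix`, `persistence_predictor`, `estimate_joint_observation`, the probability `p`) is hypothetical, and the learned auto-regressive predictive model is replaced by a trivial placeholder that repeats the previous estimate. The sketch only shows the data flow: a communication level is sampled, each agent keeps the observations it actually receives, and the missing ones are filled in from a prediction that is fed back at the next step.

```python
import random
from typing import Callable, List, Sequence

Observation = List[float]


def sample_comm_matrix(n_agents: int, p: float, rng: random.Random) -> List[List[bool]]:
    """Entry [i][j] is True when agent i receives agent j's observation.

    Assumption: each link is available independently with probability p.
    p = 0.0 gives the fully decentralized case, p = 1.0 the fully centralized one.
    An agent always has access to its own observation.
    """
    return [
        [i == j or rng.random() < p for j in range(n_agents)]
        for i in range(n_agents)
    ]


def persistence_predictor(previous_estimate: Sequence[Observation]) -> List[Observation]:
    """Placeholder for a learned predictive model: repeat the previous estimate."""
    return [list(obs) for obs in previous_estimate]


def estimate_joint_observation(
    received_row: Sequence[bool],
    observations: Sequence[Observation],
    previous_estimate: Sequence[Observation],
    predictor: Callable[[Sequence[Observation]], List[Observation]] = persistence_predictor,
) -> List[Observation]:
    """Build one agent's estimate of the joint observation.

    Received observations are used as-is; missing ones are replaced by the
    predictor's output. Passing the returned estimate back in as
    previous_estimate at the next step makes the procedure auto-regressive.
    """
    predicted = predictor(previous_estimate)
    return [
        list(observations[j]) if received_row[j] else predicted[j]
        for j in range(len(observations))
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    observations = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
    previous = [[0.0, 0.0] for _ in observations]
    comm = sample_comm_matrix(n_agents=3, p=0.5, rng=rng)
    for agent, row in enumerate(comm):
        print(agent, estimate_joint_observation(row, observations, previous))
```

In this sketch the agents never need to know `p` in advance: the same estimation step runs whatever communication matrix is sampled, which is the property the hybrid execution setting asks for.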
