Optimal communication and control strategies in a multi-agent MDP problem - Zhuanzhi (专知) Papers


April 22, 2021

Optimal communication and control strategies in a multi-agent MDP problem


Sagar Sudhakara, Dhruva Kartik, Rahul Jain, Ashutosh Nayyar

The problem of controlling multi-agent systems under different models of information sharing among agents has received significant attention in the recent literature. In this paper, we consider a setup where, rather than committing to a fixed information sharing protocol (e.g., periodic sharing or no sharing), agents can dynamically decide at each time step whether to share information with each other and incur the resulting communication cost. This setup requires a joint design of the agents' communication and control strategies in order to optimize the trade-off between communication costs and the control objective. We first show that agents can ignore a large part of their private information without compromising system performance. We then provide a solution to the strategy optimization problem based on the common information approach. This approach relies on constructing a fictitious POMDP whose solution (obtained via a dynamic program) characterizes the optimal strategies for the agents. We also show that our solution can be easily modified to incorporate constraints on when and how frequently agents can communicate.
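The trade-off between communication cost and control performance described in the abstract can be illustrated with a deliberately simplified toy problem. The sketch below is not the paper's coordinated POMDP. It assumes a single hidden binary state following a symmetric two-state Markov chain with flip probability `flip_prob` (at most 0.5), a fixed price `comm_cost` for revealing the state, and a unit control cost whenever the controller's guess is wrong. Under these assumptions the common information reduces to `age`, the number of transitions since the state was last shared, and a finite-horizon dynamic program decides at each step whether to communicate. All names (`mismatch_prob`, `solve`, `age`) are hypothetical and introduced only for illustration.

```python
from functools import lru_cache


def mismatch_prob(age: int, flip_prob: float) -> float:
    """P(state differs from its last shared value) after `age` transitions
    of a symmetric two-state Markov chain with flip probability `flip_prob`."""
    return 0.5 * (1.0 - (1.0 - 2.0 * flip_prob) ** age)


def solve(horizon: int, flip_prob: float, comm_cost: float):
    """Return value(t, age) -> (expected cost-to-go, best action).

    Toy assumptions (not taken from the paper):
      - communicating costs `comm_cost`, reveals the state, and makes the
        control error zero for that step; the age then resets to 1;
      - staying silent costs the probability that the last shared value
        is stale, and the age grows by 1.
    """

    @lru_cache(maxsize=None)
    def value(t: int, age: int):
        if t == horizon:
            return 0.0, None  # no cost after the final step
        talk = comm_cost + value(t + 1, 1)[0]
        silent = mismatch_prob(age, flip_prob) + value(t + 1, age + 1)[0]
        if talk < silent:
            return talk, "communicate"
        return silent, "silent"

    return value


if __name__ == "__main__":
    value = solve(horizon=10, flip_prob=0.2, comm_cost=0.3)
    for age in range(1, 5):
        # Print the cost-to-go and the chosen action at time 0 for each age.
        print(age, value(0, age))
```

In this toy version the communication decision depends only on the common information. In the setup the abstract describes, agents may also use their private information, which is why the paper works with a fictitious POMDP rather than a scalar recursion like the one above.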

