October 5, 2021

Influencing Towards Stable Multi-Agent Interactions

Woodrow Z. Wang, Andy Shih, Annie Xie, Dorsa Sadigh

From arXiv. 15 pages, 5 figures. Published as an Oral at the Conference on Robot Learning (CoRL) 2021.

Learning in multi-agent environments is difficult due to the non-stationarity introduced by an opponent's or partner's changing behaviors. Instead of reactively adapting to the other agent's (opponent or partner) behavior, we propose an algorithm to proactively influence the other agent's strategy to stabilize -- which can restrain the non-stationarity caused by the other agent. We learn a low-dimensional latent representation of the other agent's strategy and the dynamics of how the latent strategy evolves with respect to our robot's behavior. With this learned dynamics model, we can define an unsupervised stability reward to train our robot to deliberately influence the other agent to stabilize towards a single strategy. We demonstrate the effectiveness of stabilizing in improving efficiency of maximizing the task reward in a variety of simulated environments, including autonomous driving, emergent communication, and robotic manipulation. We show qualitative results on our website: https://sites.google.com/view/stable-marl/.
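
To make the idea of a stability reward concrete, here is a minimal sketch in Python. The abstract does not give the exact formula, so everything below is an assumption for illustration: the latent strategies are treated as plain vectors, stability is taken to be the negative Euclidean distance between consecutive latent strategies, and the names `stability_reward`, `combined_reward`, and `stability_weight` are hypothetical, not the authors' code.

```python
import math
from typing import Sequence


def stability_reward(z_prev: Sequence[float], z_next: Sequence[float]) -> float:
    """Assumed form of a stability reward: the negative Euclidean distance
    between the other agent's latent strategy in consecutive interactions.

    The reward is 0 when the latent strategy does not change and becomes
    more negative the further it moves. No labels are needed, only the
    latent vectors (e.g. as produced by a learned encoder and dynamics model).
    """
    if len(z_prev) != len(z_next):
        raise ValueError("latent vectors must have the same dimension")
    return -math.dist(z_prev, z_next)


def combined_reward(
    task_reward: float,
    z_prev: Sequence[float],
    z_next: Sequence[float],
    stability_weight: float = 1.0,
) -> float:
    """Hypothetical training signal: task reward plus a weighted stability term.

    `stability_weight` is an illustrative hyperparameter, not a value from
    the paper.
    """
    return task_reward + stability_weight * stability_reward(z_prev, z_next)


if __name__ == "__main__":
    # The other agent's latent strategy moved from (0, 0) to (3, 4).
    print(stability_reward([0.0, 0.0], [3.0, 4.0]))
    # The same move, combined with a task reward of 1.0 and weight 0.5.
    print(combined_reward(1.0, [0.0, 0.0], [3.0, 4.0], stability_weight=0.5))
```

Under these assumptions, a robot policy that keeps the other agent's latent strategy fixed pays no stability penalty, which is one way to read "influence the other agent to stabilize towards a single strategy."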
