D3G: Learning Multi-robot Coordination from Demonstrations - 专知论文 (Zhuanzhi Papers)


Learning · Robots · tuning · Nash equilibrium · Graph

March 20, 2023

D3G: Learning Multi-robot Coordination from Demonstrations


Xuan Wang,Yizhi Zhou,Wanxin Jin

This paper develops a Distributed Differentiable Dynamic Game (D3G) framework, which enables learning multi-robot coordination from demonstrations. We represent multi-robot coordination as a dynamic game, where the behavior of each robot is dictated by its own dynamics and by an objective that also depends on the other robots' behavior. The coordination can thus be adapted by tuning the objective and dynamics of each robot. The proposed D3G enables each robot to automatically tune its individual dynamics and objectives in a distributed manner by minimizing the mismatch between its trajectory and the demonstrations. This learning framework features a new design, including a forward pass, where all robots collaboratively seek a Nash equilibrium of the game, and a backward pass, where gradients are propagated via the communication graph. We test D3G in simulation with two types of robots given different task configurations. The results validate the capability of D3G for learning multi-robot coordination from demonstrations.
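To make the forward-pass and backward-pass idea concrete, here is a minimal sketch on a toy problem. It is not the D3G algorithm from the paper. It assumes a static (single-step) game between two robots with scalar actions, a hypothetical quadratic cost `(x_i - theta_i)^2 + C * (x_i - x_j)^2`, and a centralized linear solve in place of gradient propagation over a communication graph. All names (`forward_pass`, `backward_pass`, `learn`, `C`, `theta`) are illustrative assumptions.

```python
import numpy as np

C = 0.5  # hypothetical coupling weight between the two robots


def forward_pass(theta, iters=50):
    """Seek the Nash equilibrium by simultaneous iterated best response."""
    x = np.zeros(2)
    for _ in range(iters):
        # Robot i minimizes (x_i - theta_i)^2 + C * (x_i - x_j)^2 over x_i,
        # which gives x_i = (theta_i + C * x_j) / (1 + C).
        x = (theta + C * x[::-1]) / (1.0 + C)
    return x


def backward_pass(x, x_demo):
    """Gradient of sum((x - x_demo)^2) w.r.t. theta by implicit differentiation."""
    # The equilibrium satisfies A @ x = theta, so dx/dtheta = inv(A).
    A = np.array([[1.0 + C, -C], [-C, 1.0 + C]])
    return np.linalg.solve(A.T, 2.0 * (x - x_demo))


def learn(x_demo, steps=200, lr=0.5):
    """Tune theta by gradient descent on the mismatch with the demonstration."""
    theta = np.zeros(2)
    for _ in range(steps):
        x = forward_pass(theta)
        theta = theta - lr * backward_pass(x, x_demo)
    return theta


theta_true = np.array([1.0, -2.0])   # parameters that generated the demonstration
x_demo = forward_pass(theta_true)    # demonstrated equilibrium actions
theta_hat = learn(x_demo)            # parameters recovered from the demonstration
```

In this toy setting the equilibrium depends linearly and invertibly on `theta`, so gradient descent on the mismatch should recover the generating parameters. The dynamic, distributed setting described in the abstract is considerably harder and is not captured here.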


