MAGIC: Learning Macro-Actions for Online POMDP Planning


June 30, 2021


Yiyuan Lee, Panpan Cai, David Hsu

From arXiv; 9 pages (plus 2 pages of references and a 2-page appendix)

The partially observable Markov decision process (POMDP) is a principled general framework for robot decision making under uncertainty, but POMDP planning suffers from high computational complexity when long-term planning is required. While temporally-extended macro-actions help to cut down the effective planning horizon and significantly improve computational efficiency, how do we acquire good macro-actions? This paper proposes Macro-Action Generator-Critic (MAGIC), which performs offline learning of macro-actions optimized for online POMDP planning. Specifically, MAGIC learns a macro-action generator end-to-end, using an online planner's performance as the feedback. During online planning, the generator produces situation-aware macro-actions on the fly, conditioned on the robot's belief and the environment context. We evaluated MAGIC on several long-horizon planning tasks both in simulation and on a real robot. The experimental results show that the learned macro-actions offer significant benefits in online planning performance, compared with primitive actions and handcrafted macro-actions.
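To make the claim about the effective planning horizon concrete, the following is a minimal counting sketch, not the MAGIC algorithm itself. It assumes a toy setting in which the planner enumerates open-loop action sequences and ignores observation branching; the function name `num_action_sequences` and the numbers (4 choices, a 12-step horizon, macro-actions of 4 steps) are hypothetical and chosen only for illustration.

```python
def num_action_sequences(num_choices: int, horizon: int, steps_per_choice: int = 1) -> int:
    """Count the open-loop action sequences a planner would have to consider.

    num_choices: number of options available at each decision point
        (primitive actions, or macro-actions).
    horizon: planning horizon measured in primitive time steps.
    steps_per_choice: how many primitive steps one choice spans
        (1 for primitive actions, >1 for macro-actions).
    """
    if steps_per_choice <= 0 or horizon % steps_per_choice != 0:
        raise ValueError("horizon must be a positive multiple of steps_per_choice")
    # Each choice covers steps_per_choice primitive steps, so the search
    # depth (the effective horizon) shrinks by that factor.
    effective_depth = horizon // steps_per_choice
    return num_choices ** effective_depth


# Hypothetical numbers: 4 options per decision, 12 primitive steps ahead.
primitive = num_action_sequences(num_choices=4, horizon=12)
macro = num_action_sequences(num_choices=4, horizon=12, steps_per_choice=4)
print(primitive)  # 4**12 = 16777216
print(macro)      # 4**3  = 64
```

In this toy count, the same 12-step lookahead needs depth 12 with primitive actions but only depth 3 with 4-step macro-actions. The reduction only helps if the small macro-action set contains good options, which is the question the abstract raises and MAGIC addresses by learning the set.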

