October 1, 2022

Zero-Shot Policy Transfer with Disentangled Task Representation of Meta-Reinforcement Learning


Zheng Wu, Yichen Xie, Wenzhao Lian, Changhao Wang, Yanjiang Guo, Jianyu Chen, Stefan Schaal, Masayoshi Tomizuka

From arXiv; 7 pages, 9 figures

Humans are capable of abstracting various tasks as different combinations of multiple attributes. This compositional perspective is vital for rapid human learning and adaptation, since previous experience from related tasks can be combined to generalize to novel compositional settings. In this work, we aim to achieve zero-shot policy generalization of Reinforcement Learning (RL) agents by leveraging task compositionality. Our proposed method is a meta-RL algorithm with a disentangled task representation that explicitly encodes different aspects of the tasks. Policy generalization is then performed by inferring unseen compositional task representations via the obtained disentanglement, without extra exploration. The evaluation is conducted on three simulated tasks and a challenging real-world robotic insertion task. Experimental results demonstrate that our proposed method achieves policy generalization to unseen compositional tasks in a zero-shot manner.
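To make the idea of composing an unseen task representation from disentangled parts concrete, here is a minimal toy sketch in Python. It is not the authors' algorithm. The attribute names (`round`, `square`, `tight`, `loose`), the latent values, the `SPLIT` index, and the helpers `attribute_codes` and `compose` are all hypothetical, chosen only for illustration. The sketch assumes that each training task already has a latent vector whose slices correspond to separate attributes, and it builds the latent for an unseen attribute combination by recombining slices observed in training, with no interaction with the new task.

```python
from itertools import product

# Made-up latents for three training tasks (illustrative values only).
# Assumption: entries [0:2] encode attribute A, entry [2:] encodes attribute B.
SEEN_LATENTS = {
    ("round", "tight"): (1.0, 0.0, 0.2),
    ("round", "loose"): (1.0, 0.0, 0.8),
    ("square", "loose"): (0.0, 1.0, 0.8),
}
SPLIT = 2  # index where attribute A's slice ends (an assumption of this sketch)


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length tuples."""
    vectors = list(vectors)
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))


def attribute_codes(seen):
    """Average each attribute's latent slice over the seen tasks sharing its value."""
    a_slices, b_slices = {}, {}
    for (a, b), z in seen.items():
        a_slices.setdefault(a, []).append(z[:SPLIT])
        b_slices.setdefault(b, []).append(z[SPLIT:])
    a_codes = {k: mean(v) for k, v in a_slices.items()}
    b_codes = {k: mean(v) for k, v in b_slices.items()}
    return a_codes, b_codes


def compose(a, b, a_codes, b_codes):
    """Build a task latent for (a, b) by concatenating per-attribute codes."""
    return a_codes[a] + b_codes[b]


a_codes, b_codes = attribute_codes(SEEN_LATENTS)

# Attribute combinations that never appeared during training.
unseen = [t for t in product(a_codes, b_codes) if t not in SEEN_LATENTS]

for task in unseen:
    # The composed latent would be fed to a task-conditioned policy (not shown).
    print(task, compose(*task, a_codes, b_codes))
```

In this toy setup the only unseen combination is `("square", "tight")`, and its latent is assembled entirely from slices seen in other tasks. A real system would learn the encoder and the disentanglement rather than assume a fixed slice layout.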

