Meta-Reinforcement Learning via Language Instructions

September 15, 2022

Zhenshan Bing, Alexander Koch, Xiangtong Yao, Kai Huang, Alois Knoll

Although deep reinforcement learning has recently been very successful at learning complex behaviors, it requires a tremendous amount of data to learn a task. One fundamental reason for this limitation lies in the trial-and-error learning paradigm of reinforcement learning, in which the agent interacts with the environment and progresses in learning by relying only on the reward signal. This signal is implicit and rather insufficient for learning a task well. Humans, by contrast, are usually taught new skills via natural language instructions. Utilizing language instructions for robotic motion control to improve adaptability is a recently emerged and challenging topic. In this paper, we present a meta-RL algorithm that addresses the challenge of learning skills with language instructions in multiple manipulation tasks. On the one hand, our algorithm utilizes the language instructions to shape its interpretation of the task; on the other hand, it still learns to solve the task in a trial-and-error process. We evaluate our algorithm on the robotic manipulation benchmark Meta-World, where it significantly outperforms state-of-the-art methods in terms of training and testing task success rates. Code is available at https://tumi6robot.wixsite.com/million.

