State Representation Learning for Goal-Conditioned Reinforcement Learning - Zhuanzhi Papers


May 4, 2022

State Representation Learning for Goal-Conditioned Reinforcement Learning


Lorenzo Steccanella, Anders Jonsson

This paper presents a novel state representation for reward-free Markov decision processes. The idea is to learn, in a self-supervised manner, an embedding space where distances between pairs of embedded states correspond to the minimum number of actions needed to transition between them. Compared to previous methods, our approach does not require any domain knowledge, learning from offline and unlabeled data. We show how this representation can be leveraged to learn goal-conditioned policies, providing a notion of similarity between states and goals and a useful heuristic distance to guide planning and reinforcement learning algorithms. Finally, we empirically validate our method in classic control domains and multi-goal environments, demonstrating that our method can successfully learn representations in large and/or continuous domains.
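The core idea of the abstract, an embedding in which distances match the minimum number of actions between states, can be illustrated on a toy problem. The sketch below is not the authors' method: it assumes a hypothetical six-state chain MDP, uses a tabular embedding, and substitutes classical multidimensional scaling for whatever self-supervised training the paper uses. The names `step`, `min_observed_gaps`, `classical_mds`, and `greedy_action` are invented for this illustration.

```python
import numpy as np

N_STATES = 6  # assumption: chain MDP with states 0..5, actions -1 (left) and +1 (right)

def step(state, action):
    """Deterministic chain dynamics, clipped at both ends (hypothetical environment)."""
    return int(np.clip(state + action, 0, N_STATES - 1))

def min_observed_gaps(trajectories, n_states):
    """Smallest step count observed between each ordered pair of states in offline data."""
    gaps = np.full((n_states, n_states), np.inf)
    np.fill_diagonal(gaps, 0.0)
    for traj in trajectories:
        for i, s in enumerate(traj):
            for j in range(i + 1, len(traj)):
                gaps[s, traj[j]] = min(gaps[s, traj[j]], j - i)
    return gaps

def classical_mds(dist, dim):
    """Embed points so that Euclidean distances approximate `dist` (classical MDS)."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:dim]
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))

def greedy_action(phi, state, goal, actions=(-1, 1)):
    """Choose the action whose successor is closest to the goal in embedding space."""
    return min(actions, key=lambda a: np.linalg.norm(phi[step(state, a)] - phi[goal]))

# Offline, reward-free data: one sweep in each direction, with no labels or rewards.
trajectories = [list(range(N_STATES)), list(range(N_STATES - 1, -1, -1))]
gaps = min_observed_gaps(trajectories, N_STATES)

# Tabular embedding: row s of `phi` is the embedding of state s.
phi = classical_mds(gaps, dim=1)

# Pairwise Euclidean distances between embedded states.
embedded = np.linalg.norm(phi[:, None, :] - phi[None, :, :], axis=-1)
```

On this chain the observed step gaps coincide with the true shortest-path distances, so the embedded distances reproduce them. In larger or continuous domains the observed gaps are only upper bounds and a tabular embedding is not an option, which is where a learned encoder such as the one the paper describes would come in. `greedy_action` shows the "heuristic distance" use: it steers toward a goal by comparing embedded successors, and it assumes access to the transition function.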

