代表性事项:按顺序决策的离线前培训 (Representation Matters: Offline Pretraining for Sequential Decision Making) - 专知论文

会员服务 ·

0

Performer · 学成 · AIM · Extensibility · 无监督 ·

2021 年 2 月 11 日

Representation Matters: Offline Pretraining for Sequential Decision Making

翻译：代表性事项:按顺序决策的离线前培训

Mengjiao Yang,Ofir Nachum

The recent success of supervised learning methods on ever larger offline datasets has spurred interest in the reinforcement learning (RL) field to investigate whether the same paradigms can be translated to RL algorithms. This research area, known as offline RL, has largely focused on offline policy optimization, aiming to find a return-maximizing policy exclusively from offline data. In this paper, we consider a slightly different approach to incorporating offline data into sequential decision-making. We aim to answer the question, what unsupervised objectives applied to offline datasets are able to learn state representations which elevate performance on downstream tasks, whether those downstream tasks be online RL, imitation learning from expert demonstrations, or even offline policy optimization based on the same offline dataset? Through a variety of experiments utilizing standard offline RL datasets, we find that the use of pretraining with unsupervised learning objectives can dramatically improve the performance of policy learning algorithms that otherwise yield mediocre performance on their own. Extensive ablations further provide insights into what components of these unsupervised objectives -- e.g., reward prediction, continuous or discrete representations, pretraining or finetuning -- are most important and in which settings.

翻译：在更大的离线数据集方面,最近监督监督学习方法的成功,激发了人们对强化学习(RL)领域的兴趣,以调查是否可以将同样的范式转化为RL算法。这个研究领域被称为离线RL,主要侧重于离线政策优化,目的是寻找完全从离线数据返回最大化政策。在本文中,我们认为,将离线数据纳入顺序决策的方法略有不同。我们的目标是回答问题,哪些适用于离线数据集的未监督目标能够了解提升下游任务绩效的状态表现,这些下游任务是在线RL,从专家演示中模仿学习,还是基于同一离线数据集的离线政策优化?通过利用标准的离线数据设置进行各种实验,我们发现,使用未经监督的学习目标进行预先培训,可以大大改进政策学习算法的性,否则会自行产生中度业绩。通过广泛整理,进一步深入了解这些未监督目标的构成哪些组成部分 -- 例如:在线的RL,从专家演示中模仿,还是基于同一离线数据设置的离线政策优化,在最重要的预测、持续或离线的演练中进行最重要的演练。

0

相关内容

Performer

【KDD2020-清华大学】理解图表示学习中的负采样，Understanding Negative Sampling in Graph Representation Learning

【KDD2020-清华大学】理解图表示学习中的负采样，Understanding Negative Sampling in Graph Representation Learning

专知会员服务

58+阅读 · 2020年5月21日

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

专知会员服务

41+阅读 · 2020年4月11日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【芝加哥大学】GRAPH-BERT: Only Attention is Needed for Learning Graph Representations

【芝加哥大学】GRAPH-BERT: Only Attention is Needed for Learning Graph Representations

专知会员服务

85+阅读 · 2020年1月15日

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

专知会员服务

24+阅读 · 2019年11月11日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Successor representations 强化学习表示的生物学启发

Successor representations 强化学习表示的生物学启发

CreateAMind

6+阅读 · 2019年9月5日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Pretext-Contrastive Learning: Toward Good Practices in Self-supervised Video Representation Leaning

Arxiv

0+阅读 · 2021年4月4日

A Broad Study on the Transferability of Visual Representations with Contrastive Learning

Arxiv

0+阅读 · 2021年4月1日

Return-Based Contrastive Representation Learning for Reinforcement Learning

Arxiv

10+阅读 · 2021年2月22日

Multi-Task Learning for Dense Prediction Tasks: A Survey

Multi-Task Learning for Dense Prediction Tasks: A Survey

Arxiv

5+阅读 · 2020年9月16日

Contrastive Multi-View Representation Learning on Graphs

Arxiv

5+阅读 · 2020年6月10日

A Simple Framework for Contrastive Learning of Visual Representations

Arxiv

21+阅读 · 2020年2月13日

A Comprehensive Comparison of Unsupervised Network Representation Learning Methods

Arxiv

5+阅读 · 2019年3月19日

Representation Learning with Contrastive Predictive Coding

Arxiv

6+阅读 · 2019年1月22日

Learning Compositional Representations for Few-Shot Recognition

Learning Compositional Representations for Few-Shot Recognition

Arxiv

5+阅读 · 2018年12月21日

Deep Reinforcement Learning for List-wise Recommendations

Arxiv

13+阅读 · 2018年1月5日

VIP会员

文章信息

相关主题

相关VIP内容

【KDD2020-清华大学】理解图表示学习中的负采样，Understanding Negative Sampling in Graph Representation Learning

【KDD2020-清华大学】理解图表示学习中的负采样，Understanding Negative Sampling in Graph Representation Learning

专知会员服务

58+阅读 · 2020年5月21日

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

专知会员服务

41+阅读 · 2020年4月11日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【芝加哥大学】GRAPH-BERT: Only Attention is Needed for Learning Graph Representations

【芝加哥大学】GRAPH-BERT: Only Attention is Needed for Learning Graph Representations

专知会员服务

85+阅读 · 2020年1月15日

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

专知会员服务

24+阅读 · 2019年11月11日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【牛津大学博士论文】迈向可信 AI：从局部可解释性到因果理解

【ICCV2025】InfGen：一种分辨率无关的可扩展图像合成范式

《美军条令：陆军住院医疗服务》2025最新258页

大语言模型在时间序列中的推理与智能体系统综述

相关资讯

Successor representations 强化学习表示的生物学启发

Successor representations 强化学习表示的生物学启发

CreateAMind

6+阅读 · 2019年9月5日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Pretext-Contrastive Learning: Toward Good Practices in Self-supervised Video Representation Leaning

Arxiv

0+阅读 · 2021年4月4日

A Broad Study on the Transferability of Visual Representations with Contrastive Learning

Arxiv

0+阅读 · 2021年4月1日

Return-Based Contrastive Representation Learning for Reinforcement Learning

Arxiv

10+阅读 · 2021年2月22日

Multi-Task Learning for Dense Prediction Tasks: A Survey

Multi-Task Learning for Dense Prediction Tasks: A Survey

Arxiv

5+阅读 · 2020年9月16日

Contrastive Multi-View Representation Learning on Graphs

Arxiv

5+阅读 · 2020年6月10日

A Simple Framework for Contrastive Learning of Visual Representations

Arxiv

21+阅读 · 2020年2月13日

A Comprehensive Comparison of Unsupervised Network Representation Learning Methods

Arxiv

5+阅读 · 2019年3月19日

Representation Learning with Contrastive Predictive Coding

Arxiv

6+阅读 · 2019年1月22日

Learning Compositional Representations for Few-Shot Recognition

Learning Compositional Representations for Few-Shot Recognition

Arxiv

5+阅读 · 2018年12月21日

Deep Reinforcement Learning for List-wise Recommendations

Arxiv

13+阅读 · 2018年1月5日

微信扫码咨询专知VIP会员