Action-Conditioned Contrastive Policy Pretraining - Zhuanzhi Papers

Topics: learning · contrastive · unsupervised feature learning · unsupervised · controller

April 5, 2022

Action-Conditioned Contrastive Policy Pretraining

Qihang Zhang, Zhenghao Peng, Bolei Zhou

Deep visuomotor policy learning achieves promising results in control tasks such as robotic manipulation and autonomous driving, where the action is generated from the visual input by the neural policy. However, it requires a huge number of online interactions with the training environment, which limits its real-world application. Compared to the popular unsupervised feature learning for visual recognition, feature pretraining for visuomotor control tasks is much less explored. In this work, we aim to pretrain policy representations for driving tasks using hours-long uncurated YouTube videos. A new contrastive policy pretraining method is developed to learn action-conditioned features from video frames with action pseudo labels. Experiments show that the resulting action-conditioned features bring substantial improvements to the downstream reinforcement learning and imitation learning tasks, outperforming the weights pretrained from previous unsupervised learning methods. Code and models will be made publicly available.
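The abstract does not spell out the loss, so the following is only a minimal sketch of what an action-conditioned contrastive objective could look like, not the authors' implementation. It assumes an InfoNCE-style loss in which two frames count as a positive pair when their action pseudo labels differ by less than a threshold. The function name `action_conditioned_info_nce`, the scalar pseudo label (for example a steering value), and the `temperature` and `action_threshold` parameters are all hypothetical choices made for illustration.

```python
import numpy as np


def action_conditioned_info_nce(features, pseudo_actions,
                                temperature=0.1, action_threshold=0.05):
    """Illustrative contrastive loss with positives chosen by pseudo action.

    features:        (N, D) array of frame embeddings, N >= 2, no zero rows.
    pseudo_actions:  (N,) array of scalar action pseudo labels (assumed).
    Returns a non-negative float; 0.0 if no frame has a positive partner.
    """
    features = np.asarray(features, dtype=np.float64)
    pseudo_actions = np.asarray(pseudo_actions, dtype=np.float64)

    # Cosine similarity between every pair of frames, scaled by temperature.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    logits = z @ z.T / temperature

    # A frame is never contrasted against itself.
    n = z.shape[0]
    self_mask = np.eye(n, dtype=bool)
    logits = np.where(self_mask, -np.inf, logits)

    # Numerically stable log-softmax over all other frames in the batch.
    row_max = np.max(logits, axis=1, keepdims=True)
    shifted = logits - row_max
    log_prob = shifted - np.log(np.sum(np.exp(shifted), axis=1, keepdims=True))

    # Assumption: positives are frames whose pseudo actions are close.
    action_gap = np.abs(pseudo_actions[:, None] - pseudo_actions[None, :])
    pos_mask = (action_gap < action_threshold) & ~self_mask

    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0
    if not valid.any():
        return 0.0

    # Average log-probability of the positives for each anchor frame.
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_count[valid]).mean())
```

Under these assumptions the loss is small when frames with similar pseudo actions have similar embeddings and large when they do not, which is the behaviour one would want from features meant to initialise a driving policy.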
