Keypoints into the Future: Self-Supervised Correspondence in Model-Based Reinforcement Learning

Predictive models have been at the core of many robotic systems, from quadrotors to walking robots. However, developing and applying such models to practical robotic manipulation has been challenging because sensory observations such as images are high-dimensional. Previous approaches to model learning for robotic manipulation have either learned whole-image dynamics or used autoencoders to learn dynamics in a low-dimensional latent state. In this work, we introduce model-based prediction with self-supervised visual correspondence learning. We show that this is possible and that these predictive models deliver compelling performance improvements over alternative methods for vision-based RL that use autoencoder-type vision training. Through simulation experiments, we demonstrate that our models generalize with better precision, particularly in 3D scenes, scenes involving occlusion, and category-level generalization. Additionally, we validate through hardware experiments that our method transfers effectively to the real world. Videos and supplementary materials are available at https://sites.google.com/view/keypointsintothefuture


