LS3: Latent Space Safe Sets for Long-Horizon Visuomotor Control of Iterative Tasks

July 10, 2021

Albert Wilcox,Ashwin Balakrishna,Brijen Thananjeyan,Joseph E. Gonzalez,Ken Goldberg

From arXiv. Preprint, under review. The first two authors contributed equally.

Reinforcement learning (RL) algorithms have shown impressive success in exploring high-dimensional environments to learn complex, long-horizon tasks, but can often exhibit unsafe behaviors and require extensive environment interaction when exploration is unconstrained. A promising strategy for safe learning in dynamically uncertain environments is to require that the agent be able to robustly return to states where task success (and therefore safety) can be guaranteed. While this approach has been successful in low-dimensional settings, enforcing this constraint in environments with high-dimensional state spaces, such as images, is challenging. We present Latent Space Safe Sets (LS3), which extends this strategy to iterative, long-horizon tasks with image observations by using suboptimal demonstrations and a learned dynamics model to restrict exploration to the neighborhood of a learned Safe Set where task completion is likely. We evaluate LS3 on 4 domains, including a challenging sequential pushing task in simulation and a physical cable routing task. We find that LS3 can use prior task successes to restrict exploration and learn more efficiently than prior algorithms while satisfying constraints. See https://tinyurl.com/latent-ss for code and supplementary material.
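To make the idea of restricting exploration to the neighborhood of a safe set more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a toy 2-D "latent" state, a hand-written `dynamics` function standing in for a learned dynamics model, and a nearest-neighbor check `in_safe_set` standing in for a learned safe set built from previously successful trajectories. The planner `plan`, the random-shooting search, and all parameter values are hypothetical choices made only for illustration.

```python
import numpy as np

def dynamics(z, a):
    # Stand-in for a learned latent dynamics model (assumption: additive motion).
    return z + 0.1 * a

def in_safe_set(z, successes, radius=0.15):
    # Stand-in for a learned safe-set classifier: a state counts as "safe" if it
    # lies within `radius` of any state from a previously successful trajectory.
    return np.min(np.linalg.norm(successes - z, axis=1)) <= radius

def plan(z0, goal, successes, horizon=5, n_samples=500, rng=None):
    # Random-shooting planner: sample action sequences, roll them out through
    # the dynamics stand-in, discard sequences whose terminal state leaves the
    # safe-set neighborhood, and keep the feasible one that ends closest to the goal.
    rng = np.random.default_rng(0) if rng is None else rng
    best_actions, best_cost = None, np.inf
    for _ in range(n_samples):
        actions = rng.uniform(-1.0, 1.0, size=(horizon, z0.shape[0]))
        z = z0
        for a in actions:
            z = dynamics(z, a)
        if not in_safe_set(z, successes):
            continue  # constraint: the plan must end near prior successes
        cost = np.linalg.norm(z - goal)
        if cost < best_cost:
            best_actions, best_cost = actions, cost
    return best_actions, best_cost

# Hypothetical usage: prior successes lie along a straight path to the goal.
successes = np.stack([np.linspace(0.0, 1.0, 21), np.zeros(21)], axis=1)
actions, cost = plan(np.zeros(2), np.array([1.0, 0.0]), successes)
```

In this sketch the safe-set check acts as a hard filter on candidate plans, so the search can only move toward the goal through regions that earlier successful rollouts have already covered. A real system working from images would replace both stand-ins with learned models over an encoded latent state.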

