Verifiably Safe Reinforcement Learning with Probabilistic Guarantees via Temporal Logic - 专知论文



December 12, 2022


Hanna Krasowski, Prithvi Akella, Aaron Ames, Matthias Althoff

Reinforcement Learning (RL) can solve complex tasks but does not intrinsically provide any guarantees on system behavior. For real-world systems that fulfill safety-critical tasks, such guarantees on safety specifications are necessary. To bridge this gap, we propose a verifiably safe RL procedure with probabilistic guarantees. First, our approach probabilistically verifies a candidate controller with respect to a temporal logic specification, while randomizing the controller's inputs within a bounded set. Then, we use RL to improve the performance of this probabilistically verified (i.e., safe) controller, exploring within the same bounded set around the controller's input that was randomized over in the verification step. Finally, we calculate probabilistic safety guarantees with respect to temporal logic specifications for the learned agent. Our approach is efficient for continuous action and state spaces and separates safety verification and performance improvement into two independent steps. We evaluate our approach on a safe evasion task where a robot has to evade a dynamic obstacle in a specific manner while trying to reach a goal. The results show that our verifiably safe RL approach leads to efficient learning and performance improvements while maintaining safety specifications.
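The verification step described in the abstract can be illustrated with a small sampling-based sketch. This is not the authors' algorithm: the 1-D dynamics, the proportional controller, the specification "always |x| <= bound", and all function names below are hypothetical, and the guarantee is computed with the elementary zero-failure binomial bound (if N independent rollouts all satisfy the specification, then with confidence at least 1 - beta the violation probability is at most 1 - beta^(1/N)), which may differ from the bound used in the paper.

```python
import random


def nominal_controller(x):
    """Hypothetical candidate controller: proportional feedback toward 0."""
    return -0.5 * x


def step(x, u):
    """Hypothetical 1-D integrator dynamics: x_next = x + u."""
    return x + u


def satisfies_always_safe(trajectory, bound):
    """Check the specification G(|x| <= bound) on a finite trajectory."""
    return all(abs(x) <= bound for x in trajectory)


def rollout(x0, delta, horizon, rng):
    """Simulate the controller with its input perturbed inside [-delta, delta]."""
    x = x0
    trajectory = [x]
    for _ in range(horizon):
        u = nominal_controller(x) + rng.uniform(-delta, delta)
        x = step(x, u)
        trajectory.append(x)
    return trajectory


def verify(num_samples, delta, bound, horizon, beta, seed=0):
    """Return an upper bound epsilon on the violation probability, or None.

    If every sampled rollout satisfies the specification, then with
    confidence at least 1 - beta the true violation probability is at most
    epsilon = 1 - beta ** (1 / num_samples). If any rollout violates the
    specification, no guarantee is issued and None is returned.
    """
    rng = random.Random(seed)
    for _ in range(num_samples):
        x0 = rng.uniform(-1.0, 1.0)  # assumed set of initial states
        trajectory = rollout(x0, delta, horizon, rng)
        if not satisfies_always_safe(trajectory, bound):
            return None
    return 1.0 - beta ** (1.0 / num_samples)


if __name__ == "__main__":
    epsilon = verify(num_samples=1000, delta=0.2, bound=1.5, horizon=50, beta=1e-3)
    print("violation probability bound:", epsilon)
```

In this toy setting, an RL agent that only adds a correction inside the same interval [-delta, delta] around the nominal input stays within the set of behaviors that was sampled during verification, which is the intuition behind separating verification from performance improvement.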

