Cautious Bayesian Optimization for Efficient and Scalable Policy Search - 专知 Papers


November 18, 2020

Cautious Bayesian Optimization for Efficient and Scalable Policy Search


Lukas P. Fröhlich, Melanie N. Zeilinger, Edgar D. Klenske

Sample efficiency is one of the key factors when applying policy search to real-world problems. In recent years, Bayesian Optimization (BO) has become prominent in robotics because it is sample-efficient and requires little prior knowledge. However, one drawback of BO is its poor performance in high-dimensional search spaces, since it focuses on global search. In the policy search setting, local optimization is typically sufficient because initial policies are often available, e.g., via meta-learning, kinesthetic demonstrations, or sim-to-real approaches. In this paper, we propose to constrain the policy search space to a sublevel set of the Bayesian surrogate model's predictive uncertainty. This simple yet effective way of constraining the policy update enables BO to scale to high-dimensional spaces (>100 dimensions) and reduces the risk of damaging the system. We demonstrate the effectiveness of our approach on a wide range of problems, including a motor skills task, adapting deep RL agents to new reward signals, and a sim-to-real task for an inverted pendulum system.
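The central idea of the abstract, restricting each policy update to the region where the surrogate's predictive uncertainty stays below a threshold, can be illustrated with a small sketch. This is not the authors' implementation: the squared-exponential kernel, the threshold `gamma`, the UCB acquisition with weight `beta`, the random candidate sampling, and the toy objective are all assumptions chosen only to make the constraint concrete.

```python
import numpy as np

PRIOR_STD = 1.0      # assumed prior standard deviation of the GP
LENGTHSCALE = 0.5    # assumed kernel lengthscale

def rbf_kernel(A, B):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return PRIOR_STD**2 * np.exp(-0.5 * np.maximum(sq, 0.0) / LENGTHSCALE**2)

def gp_posterior(X_train, y_train, X_query, noise_var=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at X_query."""
    K = rbf_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, K_s)
    var = PRIOR_STD**2 - np.sum(v**2, axis=0)
    return K_s.T @ alpha, np.sqrt(np.maximum(var, 0.0))

def cautious_step(X_train, y_train, candidates, gamma=0.6, beta=2.0):
    """Maximize UCB only over the sublevel set {theta : sigma(theta) <= gamma * PRIOR_STD}."""
    mean, std = gp_posterior(X_train, y_train, candidates)
    allowed = std <= gamma * PRIOR_STD
    if not np.any(allowed):
        raise ValueError("no candidate lies inside the uncertainty sublevel set")
    ucb = np.where(allowed, mean + beta * std, -np.inf)
    return candidates[np.argmax(ucb)], allowed

# Toy usage: start from an initial policy and improve it locally.
rng = np.random.default_rng(0)
dim = 3

def objective(theta):
    """Hypothetical reward with its optimum at theta = 0.3 in every dimension."""
    return -np.sum((theta - 0.3) ** 2)

X = np.zeros((1, dim))               # initial policy parameters (assumed given)
y = np.array([objective(X[0])])
for _ in range(10):
    incumbent = X[np.argmax(y)]
    # Candidates: previously evaluated policies plus random perturbations of the best one.
    candidates = np.vstack([X, incumbent + 0.3 * rng.standard_normal((500, dim))])
    theta_next, _ = cautious_step(X, y, candidates)
    X = np.vstack([X, theta_next])
    y = np.append(y, objective(theta_next))

print("best reward found:", y.max())
```

Because the posterior standard deviation is small only near parameters that have already been evaluated, the constraint keeps each step close to known policies, and the admissible region grows as data accumulates. A smaller `gamma` gives more conservative updates; `gamma >= 1` removes the constraint in this sketch, since the posterior standard deviation never exceeds the prior one.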

