在连续行动空间内为蒙特卡洛规划制定有限的以深度土匪为基础的持续行动空间规划战略 (Limited depth bandit-based strategy for Monte Carlo planning in continuous action spaces) - 专知论文

会员服务 ·

0

Continuity · 蒙特卡罗 · 优化器 · 控制器 · 赌博机/老虎机 ·

2021 年 6 月 29 日

Limited depth bandit-based strategy for Monte Carlo planning in continuous action spaces

翻译：在连续行动空间内为蒙特卡洛规划制定有限的以深度土匪为基础的持续行动空间规划战略

Ricardo Quinteiro,Francisco S. Melo,Pedro A. Santos

This paper addresses the problem of optimal control using search trees. We start by considering multi-armed bandit problems with continuous action spaces and propose LD-HOO, a limited depth variant of the hierarchical optimistic optimization (HOO) algorithm. We provide a regret analysis for LD-HOO and show that, asymptotically, our algorithm exhibits the same cumulative regret as the original HOO while being faster and more memory efficient. We then propose a Monte Carlo tree search algorithm based on LD-HOO for optimal control problems and illustrate the resulting approach's application in several optimal control problems.

翻译：本文探讨使用搜索树进行最佳控制的问题。我们首先考虑多武装强盗问题, 包括连续行动空间, 并提出LD- HOO, 这是等级乐观优化(HOO)算法的有限深度变量。我们为LD- HOO提供了遗憾分析, 并表明我们的算法与原HOO一样, 累积了同样的遗憾, 同时速度更快, 记忆效率更高。然后我们提出一个基于LD- HOO的蒙特卡洛树搜索算法, 以优化控制问题, 并演示由此产生的方法在若干最佳控制问题中的应用。

0

相关内容

Continuity

让 iOS 8 和 OS X Yosemite 无缝切换的一个新特性。 > Apple products have always been designed to work together beautifully. But now they may really surprise you. With iOS 8 and OS X Yosemite, you’ll be able to do more wonderful things than ever before.

Source: Apple - iOS 8

不可错过！UIUC最新《统计强化学习》课程！

专知会员服务

54+阅读 · 2020年9月7日

【Manning新书】现代Java实战，592页pdf

【Manning新书】现代Java实战，592页pdf

专知会员服务

101+阅读 · 2020年5月22日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

112+阅读 · 2020年5月15日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

经典书《斯坦福大学-多智能体系统》532页pdf，MULTIAGENT SYSTEMS Algorithmic, Game-Theoretic, and Logical Foundations

经典书《斯坦福大学-多智能体系统》532页pdf，MULTIAGENT SYSTEMS Algorithmic, Game-Theoretic, and Logical Foundations

专知会员服务

158+阅读 · 2020年1月29日

实时强化学习《Real-Time Reinforcement Learning》S Ramstedt, C Pal [Mila, Element AI] (2019)

实时强化学习《Real-Time Reinforcement Learning》S Ramstedt, C Pal [Mila, Element AI] (2019)

专知会员服务

13+阅读 · 2019年11月17日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

246+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

已删除

将门创投

11+阅读 · 2019年4月26日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

Max-Utility Based Arm Selection Strategy For Sequential Query Recommendations

Arxiv

0+阅读 · 2021年8月31日

Learning Practically Feasible Policies for Online 3D Bin Packing

Arxiv

0+阅读 · 2021年8月31日

Towards Optimally Efficient Search with Deep Learning for Large-Scale MIMO Systems

Arxiv

0+阅读 · 2021年8月29日

An Introduction to Hamiltonian Monte Carlo Method for Sampling

Arxiv

0+阅读 · 2021年8月27日

Robust Motion Planning in the Presence of Estimation Uncertainty

Arxiv

0+阅读 · 2021年8月26日

Learning and Planning in Complex Action Spaces

Arxiv

4+阅读 · 2021年4月13日

Logically-Constrained Reinforcement Learning

Logically-Constrained Reinforcement Learning

Arxiv

3+阅读 · 2018年12月6日

Implicit Maximum Likelihood Estimation

Implicit Maximum Likelihood Estimation

Arxiv

7+阅读 · 2018年9月24日

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving

Arxiv

8+阅读 · 2018年7月10日

Accelerated Reinforcement Learning

Arxiv

6+阅读 · 2018年4月24日

VIP会员

文章信息

相关主题

赌博机/老虎机

相关VIP内容

不可错过！UIUC最新《统计强化学习》课程！

专知会员服务

54+阅读 · 2020年9月7日

【Manning新书】现代Java实战，592页pdf

【Manning新书】现代Java实战，592页pdf

专知会员服务

101+阅读 · 2020年5月22日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

112+阅读 · 2020年5月15日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

经典书《斯坦福大学-多智能体系统》532页pdf，MULTIAGENT SYSTEMS Algorithmic, Game-Theoretic, and Logical Foundations

经典书《斯坦福大学-多智能体系统》532页pdf，MULTIAGENT SYSTEMS Algorithmic, Game-Theoretic, and Logical Foundations

专知会员服务

158+阅读 · 2020年1月29日

实时强化学习《Real-Time Reinforcement Learning》S Ramstedt, C Pal [Mila, Element AI] (2019)

实时强化学习《Real-Time Reinforcement Learning》S Ramstedt, C Pal [Mila, Element AI] (2019)

专知会员服务

13+阅读 · 2019年11月17日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

246+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

数据要素发展报告(2025年)：附下载

人工智能代理提升战时舰船战备水平

【NeurIPS2025教程】大语言模型规划

NeurIPS 2025 教程：深度学习训练不稳定性的理论洞见

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

已删除

将门创投

11+阅读 · 2019年4月26日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

Max-Utility Based Arm Selection Strategy For Sequential Query Recommendations

Arxiv

0+阅读 · 2021年8月31日

Learning Practically Feasible Policies for Online 3D Bin Packing

Arxiv

0+阅读 · 2021年8月31日

Towards Optimally Efficient Search with Deep Learning for Large-Scale MIMO Systems

Arxiv

0+阅读 · 2021年8月29日

An Introduction to Hamiltonian Monte Carlo Method for Sampling

Arxiv

0+阅读 · 2021年8月27日

Robust Motion Planning in the Presence of Estimation Uncertainty

Arxiv

0+阅读 · 2021年8月26日

Learning and Planning in Complex Action Spaces

Arxiv

4+阅读 · 2021年4月13日

Logically-Constrained Reinforcement Learning

Logically-Constrained Reinforcement Learning

Arxiv

3+阅读 · 2018年12月6日

Implicit Maximum Likelihood Estimation

Implicit Maximum Likelihood Estimation

Arxiv

7+阅读 · 2018年9月24日

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving

Arxiv

8+阅读 · 2018年7月10日

Accelerated Reinforcement Learning

Arxiv

6+阅读 · 2018年4月24日

微信扫码咨询专知VIP会员