跟踪利用未来参考资料信息控制 (Proximal Policy Optimization for Tracking Control Exploiting Future Reference Information) - 专知论文

会员服务 ·

0

INFORMS · 控制器 · 优化器 · Performer · Better ·

2021 年 7 月 20 日

Proximal Policy Optimization for Tracking Control Exploiting Future Reference Information

翻译：跟踪利用未来参考资料信息控制

Jana Mayer,Johannes Westermann,Juan Pedro Gutiérrez H. Muriedas,Uwe Mettin,Alexander Lampe

In recent years, reinforcement learning (RL) has gained increasing attention in control engineering. Especially, policy gradient methods are widely used. In this work, we improve the tracking performance of proximal policy optimization (PPO) for arbitrary reference signals by incorporating information about future reference values. Two variants of extending the argument of the actor and the critic taking future reference values into account are presented. In the first variant, global future reference values are added to the argument. For the second variant, a novel kind of residual space with future reference values applicable to model-free reinforcement learning is introduced. Our approach is evaluated against a PI controller on a simple drive train model. We expect our method to generalize to arbitrary references better than previous approaches, pointing towards the applicability of RL to control real systems.

翻译：近年来,强化学习(RL)在控制工程中日益受到重视,特别是政策梯度方法被广泛使用。在这项工作中,我们通过纳入未来参考值信息,改进了对任意参考信号的准政策优化(PPO)的跟踪工作。介绍了扩展行为者和评论家的论点并考虑未来参考值的两种变式。在第一种变式中,将增加全球未来参考值。在第二种变式中,引入了一种新颖的剩余空间,其中含有适用于无模型强化学习的未来参考值。我们的方法是用一个简单的驱动器模型的PI控制器来评价我们的方法。我们期望我们的方法比以往的方法更好地概括任意引用,指出RL对控制真实系统的适用性。

0

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

【圣经书】《强化学习导论(2nd)》电子书与代码，548页pdf

【圣经书】《强化学习导论(2nd)》电子书与代码，548页pdf

专知会员服务

208+阅读 · 2020年5月22日

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

专知会员服务

41+阅读 · 2020年4月11日

【Manning2020新书】深度强化学习实战，351页pdf，Deep Reinforcement Learning

【Manning2020新书】深度强化学习实战，351页pdf，Deep Reinforcement Learning

专知会员服务

290+阅读 · 2020年3月10日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【强化学习资源集合】Awesome Reinforcement Learning

【强化学习资源集合】Awesome Reinforcement Learning

专知会员服务

97+阅读 · 2019年12月23日

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

专知会员服务

21+阅读 · 2019年12月2日

【AAAI 2019 Tutorial】城市交通控制的规划与调度方法（Planning and Scheduling Approaches for Urban Traffic Control），Scott Sanner，Mauro Vallati，Stephen F. Smith

【AAAI 2019 Tutorial】城市交通控制的规划与调度方法（Planning and Scheduling Approaches for Urban Traffic Control），Scott Sanner，Mauro Vallati，Stephen F. Smith

专知会员服务

8+阅读 · 2019年11月18日

【麻省理工学院课程】MIT 6.S191：Introduction to Deep Learning , 深度学习导论,NSF研究员Alexander Amini

【麻省理工学院课程】MIT 6.S191：Introduction to Deep Learning , 深度学习导论,NSF研究员Alexander Amini

专知会员服务

34+阅读 · 2019年11月2日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

spinningup.openai 强化学习资源完整

spinningup.openai 强化学习资源完整

CreateAMind

6+阅读 · 2018年12月17日

【下载】深度强化学习实战书籍和代码《Deep Reinforcement Learning in Action》

【下载】深度强化学习实战书籍和代码《Deep Reinforcement Learning in Action》

专知

77+阅读 · 2018年8月7日

春节充电系列：李宏毅2017机器学习课程学习笔记31之深度强化学习(deep reinforcement learning)

春节充电系列：李宏毅2017机器学习课程学习笔记31之深度强化学习(deep reinforcement learning)

专知

3+阅读 · 2018年3月21日

【推荐】直接未来预测：增强学习监督学习

【推荐】直接未来预测：增强学习监督学习

机器学习研究会

6+阅读 · 2017年11月24日

推荐｜Andrew Ng计算机视觉教程总结

推荐｜Andrew Ng计算机视觉教程总结

全球人工智能

3+阅读 · 2017年11月23日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Long-Term Exploration in Persistent MDPs

Arxiv

0+阅读 · 2021年9月21日

Demonstration-Efficient Guided Policy Search via Imitation of Robust Tube MPC

Arxiv

0+阅读 · 2021年9月21日

Policy Gradient Bayesian Robust Optimization for Imitation Learning

Arxiv

5+阅读 · 2021年6月11日

Inverse Constrained Reinforcement Learning

Arxiv

8+阅读 · 2021年5月21日

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

Arxiv

80+阅读 · 2020年1月19日

PPO-CMA: Proximal Policy Optimization with Covariance Matrix Adaptation

PPO-CMA: Proximal Policy Optimization with Covariance Matrix Adaptation

Arxiv

8+阅读 · 2018年12月18日

End-to-end Active Object Tracking via Reinforcement Learning

Arxiv

3+阅读 · 2018年6月1日

Learning to Adapt: Meta-Learning for Model-Based Control

Arxiv

9+阅读 · 2018年3月30日

Tracking Noisy Targets: A Review of Recent Object Tracking Approaches

Arxiv

9+阅读 · 2018年2月14日

Inverse Reinforcement Learning via Deep Gaussian Process

Arxiv

3+阅读 · 2017年5月4日

VIP会员

文章信息

相关主题

相关VIP内容

【圣经书】《强化学习导论(2nd)》电子书与代码，548页pdf

【圣经书】《强化学习导论(2nd)》电子书与代码，548页pdf

专知会员服务

208+阅读 · 2020年5月22日

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

专知会员服务

41+阅读 · 2020年4月11日

【Manning2020新书】深度强化学习实战，351页pdf，Deep Reinforcement Learning

【Manning2020新书】深度强化学习实战，351页pdf，Deep Reinforcement Learning

专知会员服务

290+阅读 · 2020年3月10日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【强化学习资源集合】Awesome Reinforcement Learning

【强化学习资源集合】Awesome Reinforcement Learning

专知会员服务

97+阅读 · 2019年12月23日

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

专知会员服务

21+阅读 · 2019年12月2日

【AAAI 2019 Tutorial】城市交通控制的规划与调度方法（Planning and Scheduling Approaches for Urban Traffic Control），Scott Sanner，Mauro Vallati，Stephen F. Smith

【AAAI 2019 Tutorial】城市交通控制的规划与调度方法（Planning and Scheduling Approaches for Urban Traffic Control），Scott Sanner，Mauro Vallati，Stephen F. Smith

专知会员服务

8+阅读 · 2019年11月18日

【麻省理工学院课程】MIT 6.S191：Introduction to Deep Learning , 深度学习导论,NSF研究员Alexander Amini

【麻省理工学院课程】MIT 6.S191：Introduction to Deep Learning , 深度学习导论,NSF研究员Alexander Amini

专知会员服务

34+阅读 · 2019年11月2日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

热门VIP内容

开通专知VIP会员享更多权益服务

《具备集体态势感知能力的深度强化学习智能体在超视距空战中的应用研究》最新文献

《美军条令文件：频谱管理操作技术》2025最新100页

反制小型无人机：一项重大挑战

《AI作战：将人机协作集成至实时、虚拟与建构环境（LVC）的建模与仿真》

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

spinningup.openai 强化学习资源完整

spinningup.openai 强化学习资源完整

CreateAMind

6+阅读 · 2018年12月17日

【下载】深度强化学习实战书籍和代码《Deep Reinforcement Learning in Action》

【下载】深度强化学习实战书籍和代码《Deep Reinforcement Learning in Action》

专知

77+阅读 · 2018年8月7日

春节充电系列：李宏毅2017机器学习课程学习笔记31之深度强化学习(deep reinforcement learning)

春节充电系列：李宏毅2017机器学习课程学习笔记31之深度强化学习(deep reinforcement learning)

专知

3+阅读 · 2018年3月21日

【推荐】直接未来预测：增强学习监督学习

【推荐】直接未来预测：增强学习监督学习

机器学习研究会

6+阅读 · 2017年11月24日

推荐｜Andrew Ng计算机视觉教程总结

推荐｜Andrew Ng计算机视觉教程总结

全球人工智能

3+阅读 · 2017年11月23日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Long-Term Exploration in Persistent MDPs

Arxiv

0+阅读 · 2021年9月21日

Demonstration-Efficient Guided Policy Search via Imitation of Robust Tube MPC

Arxiv

0+阅读 · 2021年9月21日

Policy Gradient Bayesian Robust Optimization for Imitation Learning

Arxiv

5+阅读 · 2021年6月11日

Inverse Constrained Reinforcement Learning

Arxiv

8+阅读 · 2021年5月21日

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

Arxiv

80+阅读 · 2020年1月19日

PPO-CMA: Proximal Policy Optimization with Covariance Matrix Adaptation

PPO-CMA: Proximal Policy Optimization with Covariance Matrix Adaptation

Arxiv

8+阅读 · 2018年12月18日

End-to-end Active Object Tracking via Reinforcement Learning

Arxiv

3+阅读 · 2018年6月1日

Learning to Adapt: Meta-Learning for Model-Based Control

Arxiv

9+阅读 · 2018年3月30日

Tracking Noisy Targets: A Review of Recent Object Tracking Approaches

Arxiv

9+阅读 · 2018年2月14日

Inverse Reinforcement Learning via Deep Gaussian Process

Arxiv

3+阅读 · 2017年5月4日

微信扫码咨询专知VIP会员