Reinforcement Learning: An Introduction, 2nd edition (2018), 500 pages

April 27, 2018 · CreateAMind

http://incompleteideas.net/book/the-book-2nd.html


https://pan.baidu.com/s/1Z2SFNhtDAldSvgVZHOiyiw (or access via the original article)

Related content

Reinforcement learning (RL) is an area of machine learning concerned with how software agents should take actions in an environment so as to maximize cumulative reward. Alongside supervised learning and unsupervised learning, it is one of the three basic machine learning paradigms. Reinforcement learning differs from supervised learning in that labeled input/output pairs need not be presented and suboptimal actions need not be explicitly corrected; instead, the focus is on finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge). The environment is typically stated in the form of a Markov decision process (MDP), because many reinforcement learning algorithms for this setting use dynamic programming techniques. The main difference between classical dynamic programming methods and reinforcement learning algorithms is that the latter do not assume an exact mathematical model of the MDP and target large MDPs where exact methods become infeasible.
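
To make the exploration/exploitation trade-off and the model-free aspect concrete, here is a minimal sketch of tabular Q-learning with epsilon-greedy exploration in Python. The five-state chain environment, the exploration rate, and the learning rate are illustrative assumptions, not material from the book.

```python
import random

# Minimal sketch of tabular Q-learning with epsilon-greedy exploration on a
# hypothetical 5-state chain: stepping right from the last state pays reward 1.
N_STATES = 5
ACTIONS = (-1, +1)                             # step left / step right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1          # learning rate, discount, exploration rate
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Model-free interaction: the agent only observes (next_state, reward)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if (state == N_STATES - 1 and action == +1) else 0.0
    return next_state, reward

for episode in range(200):
    s = 0
    for _ in range(20):
        # Exploration vs. exploitation: random action with probability epsilon,
        # otherwise the greedy action under the current value estimates.
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next, r = step(s, a)
        # Q-learning update: bootstrap from the best next action, with no model
        # of the MDP's transition probabilities or reward function.
        best_next = max(Q[(s_next, act)] for act in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s_next

print(max(Q[(0, act)] for act in ACTIONS))     # estimated value of the start state
```

Because the update bootstraps only from observed transitions, the agent never needs the MDP's transition probabilities, which is the distinction from classical dynamic programming drawn above.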


A Modern Introduction to Online Learning (Francesco Orabona, 2019): In this monograph, I introduce the basic concepts of Online Learning through a modern view of Online Convex Optimization. Here, online learning refers to the framework of regret minimization under worst-case assumptions. I present first-order and second-order algorithms for online learning with convex losses, in Euclidean and non-Euclidean settings. All the algorithms are clearly presented as instantiations of Online Mirror Descent or Follow-The-Regularized-Leader and their variants. Particular attention is given to the issue of tuning the parameters of the algorithms and to learning in unbounded domains, through adaptive and parameter-free online learning algorithms. Non-convex losses are dealt with through convex surrogate losses and through randomization. The bandit setting is also briefly discussed, touching on the problem of adversarial and stochastic multi-armed bandits. These notes do not require prior knowledge of convex analysis, and all the required mathematical tools are rigorously explained. Moreover, all the proofs have been carefully chosen to be as simple and as short as possible.
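
As a small illustration of the regret-minimization framework described in the abstract, the sketch below runs projected online gradient descent, the Euclidean instance of Online Mirror Descent, against a sequence of quadratic losses and reports the regret against the best fixed point in hindsight. The specific losses, the ball radius, and the step-size schedule are illustrative assumptions, not taken from the monograph.

```python
import numpy as np

# Minimal sketch of projected online (sub)gradient descent, the Euclidean
# instance of Online Mirror Descent, on an l2 ball of radius D. The quadratic
# losses and all constants below are illustrative assumptions.
rng = np.random.default_rng(0)
D, T, d = 1.0, 1000, 3
x = np.zeros(d)                                  # learner's first prediction
targets = rng.uniform(-0.5, 0.5, size=(T, d))    # the environment's (here random) choices

def project(z, radius):
    """Euclidean projection onto the l2 ball of the given radius."""
    norm = np.linalg.norm(z)
    return z if norm <= radius else z * (radius / norm)

loss_sum = 0.0
for t in range(1, T + 1):
    z_t = targets[t - 1]
    loss_sum += 0.5 * np.sum((x - z_t) ** 2)     # loss_t(x) = 0.5 * ||x - z_t||^2
    grad = x - z_t                               # gradient of loss_t at the prediction
    eta = D / np.sqrt(t)                         # standard decreasing step size
    x = project(x - eta * grad, D)               # mirror-descent / projected-gradient step

# Regret against the best fixed point in hindsight (here, the projected mean).
best = project(targets.mean(axis=0), D)
best_loss = 0.5 * np.sum((targets - best) ** 2)
print("cumulative regret:", loss_sum - best_loss)
```

With this step-size schedule on a bounded domain, the cumulative regret grows only on the order of the square root of T, which is the kind of worst-case guarantee the monograph develops.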


The second edition of Reinforcement Learning: An Introduction is hot off the press. The book, published by MIT Press, is expected to go to print in November; its electronic version has already been made public by the authors, so it can be read ahead of publication.

Download link: https://pan.baidu.com/s/1BMy9seCGx_SlTHZRhpfdlA (password: ka1a)


Stabilizing Transformers for Reinforcement Learning. E. Parisotto, H. F. Song, J. W. Rae, R. Pascanu, C. Gulcehre, S. M. Jayakumar, M. Jaderberg, R. L. Kaufman, A. Clark, S. Noury, M. M. Botvinick, N. Heess, R. Hadsell [DeepMind], 2019.


Haarnoja, Zhou, Ha, Tan, Tucker, and Levine (2018): Deep reinforcement learning suggests the promise of fully automated learning of robotic control policies that directly map sensory inputs to low-level actions. However, applying deep reinforcement learning methods on real-world robots is exceptionally difficult, due both to the sample complexity and, just as importantly, the sensitivity of such methods to hyperparameters. While hyperparameter tuning can be performed in parallel in simulated domains, it is usually impractical to tune hyperparameters directly on real-world robotic platforms, especially legged platforms like quadrupedal robots that can be damaged through extensive trial-and-error learning. In this paper, we develop a stable variant of the soft actor-critic deep reinforcement learning algorithm that requires minimal hyperparameter tuning, while also requiring only a modest number of trials to learn multilayer neural network policies. This algorithm is based on the framework of maximum entropy reinforcement learning, and automatically trades off exploration against exploitation by dynamically and automatically tuning a temperature parameter that determines the stochasticity of the policy. We show that this method achieves state-of-the-art performance on four standard benchmark environments. We then demonstrate that it can be used to learn quadrupedal locomotion gaits on a real-world Minitaur robot, learning to walk from scratch directly in the real world in two hours of training.
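
The temperature adjustment mentioned above can be illustrated with a small stand-alone sketch: alpha is updated by stochastic gradient descent on J(alpha) = E[-alpha (log pi(a|s) + H_target)], so it grows whenever the policy's entropy falls below the target and shrinks otherwise. The stand-in Gaussian policy, the target entropy, and the learning rate below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Schematic sketch of automatic temperature (alpha) adjustment in maximum
# entropy RL: alpha is tuned so the policy's entropy tracks a target value.
# The stand-in Gaussian "policy", target entropy, and learning rate are assumptions.
rng = np.random.default_rng(0)
action_dim = 1
target_entropy = -float(action_dim)   # common heuristic: minus the action dimensionality
log_alpha, lr = 0.0, 3e-4             # optimize log(alpha) so alpha stays positive

def sample_log_prob(std=0.05):
    """Sample from a fixed, overly deterministic Gaussian policy; return log pi(a)."""
    a = rng.normal(0.0, std)
    return -0.5 * ((a / std) ** 2 + np.log(2.0 * np.pi * std ** 2))

for _ in range(500):
    log_pi = sample_log_prob()
    # J(alpha) = E[-alpha * (log pi + target_entropy)]; stochastic gradient w.r.t. log(alpha).
    grad = -np.exp(log_alpha) * (log_pi + target_entropy)
    log_alpha -= lr * grad

# The stand-in policy's entropy is below the target, so alpha increases,
# which would push a real policy toward more exploration.
print("temperature alpha:", np.exp(log_alpha))
```

In the full algorithm the policy is trained against the same temperature, so its entropy rises in response and alpha settles; with the fixed stand-in policy above, alpha simply drifts upward.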


A Multi-Objective Deep Reinforcement Learning Framework (Thanh Thi Nguyen, 2018): This paper presents a new multi-objective deep reinforcement learning (MODRL) framework based on deep Q-networks. We propose the use of linear and non-linear methods to develop the MODRL framework, which includes both single-policy and multi-policy strategies. The experimental results on two benchmark problems, the two-objective deep sea treasure environment and the three-objective mountain car problem, indicate that the proposed framework is able to converge to the optimal Pareto solutions effectively. The proposed framework is generic, allowing different deep reinforcement learning algorithms to be implemented in different complex environments. It therefore overcomes many of the difficulties of standard multi-objective reinforcement learning (MORL) methods in the current literature, and it provides a platform and testbed environment for developing methods that address various problems associated with current MORL. Details of the framework implementation are available at http://www.deakin.edu.au/~thanhthi/drl.htm.
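
As a rough illustration of the single-policy strategies mentioned above, the sketch below collapses a vector of per-objective Q-values into a single score with a preference weight vector, linearly and with a Chebyshev-style non-linear alternative, before greedy action selection. The numbers and the two-objective setup are illustrative assumptions; the actual implementation is at the URL given in the abstract.

```python
import numpy as np

# Minimal sketch of scalarization for single-policy multi-objective RL:
# per-objective Q-values are reduced to one score per action before the
# usual greedy selection. All numbers below are illustrative only.
q_values = np.array([          # rows: actions, columns: objectives (e.g. treasure, time)
    [5.0, -1.0],
    [8.0, -6.0],
    [2.0, -0.5],
])
weights = np.array([0.7, 0.3])              # relative preference over the two objectives

# Linear scalarization: weighted sum of objectives, one scalar per action.
linear = q_values @ weights
print(linear, "-> choose action", int(np.argmax(linear)))

# A common non-linear alternative: Chebyshev scalarization against a utopia point,
# minimizing the largest weighted distance to that point (negated so higher is better).
utopia = q_values.max(axis=0) + 1.0
chebyshev = -(weights * (utopia - q_values)).max(axis=1)
print(chebyshev, "-> choose action", int(np.argmax(chebyshev)))
```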


Mousavi, Schukat, and Howley (2018): In recent years, a specific machine learning method called deep learning has gained huge attention, as it has obtained astonishing results in broad applications such as pattern recognition, speech recognition, computer vision, and natural language processing. Recent research has also shown that deep learning techniques can be combined with reinforcement learning methods to learn useful representations for problems with high-dimensional raw data input. This chapter reviews recent advances in deep reinforcement learning, with a focus on the most widely used deep architectures, such as autoencoders, convolutional neural networks, and recurrent neural networks, which have been successfully combined with the reinforcement learning framework.
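
As one concrete example of the kind of pairing the chapter surveys, the sketch below defines a small convolutional Q-network that maps raw pixel observations to one value per action, so a reinforcement learning update can operate on learned features rather than hand-crafted ones. The DQN-style layer sizes and the 84x84 four-frame input are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Minimal sketch: a convolutional network produces one Q-value per action from
# raw stacked-frame pixel input. Layer sizes are illustrative (DQN-style).
class ConvQNetwork(nn.Module):
    def __init__(self, n_actions: int, in_channels: int = 4):
        super().__init__()
        self.features = nn.Sequential(            # convolutional feature extractor
            nn.Conv2d(in_channels, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Sequential(                # maps learned features to action values
            nn.Linear(32 * 9 * 9, 256), nn.ReLU(),
            nn.Linear(256, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(obs / 255.0))   # one Q-value per action

# A batch of two stacked-frame observations -> a (2, n_actions) tensor of Q-values.
obs = torch.randint(0, 256, (2, 4, 84, 84)).float()
print(ConvQNetwork(n_actions=6)(obs).shape)
```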

Related content

Learning by Cheating (paper)
Awesome Reinforcement Learning (resource collection)
Stabilizing Transformers for Reinforcement Learning
Reinforcement Learning and Optimal Control (new MIT book)
Related papers

A Modern Introduction to Online Learning. Francesco Orabona (2019-12-31)
Language as an Abstraction for Hierarchical Deep Reinforcement Learning. Yiding Jiang, Shixiang Gu, Kevin Murphy, Chelsea Finn (2019-06-18)
Tuomas Haarnoja, Aurick Zhou, Sehoon Ha, Jie Tan, George Tucker, Sergey Levine (2018-12-26)
Ziwei Zhang, Peng Cui, Wenwu Zhu (2018-12-11)
Andreas Kamilaris, Francesc X. Prenafeta-Boldu (2018-07-31)
CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving. Xiaodan Liang, Tairui Wang, Luona Yang, Eric Xing (2018-07-10)
A Multi-Objective Deep Reinforcement Learning Framework. Thanh Thi Nguyen (2018-06-27)
Seyed Sajad Mousavi, Michael Schukat, Enda Howley (2018-06-23)
Abhishek Gupta, Benjamin Eysenbach, Chelsea Finn, Sergey Levine (2018-06-12)
Ermo Wei, Drew Wicke, David Freelan, Sean Luke (2018-04-25)