The Most Complete Collection of "Deep Reinforcement Learning" Study Materials on the Web

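About This Work

This work is a nonprofit project initiated by the Deep Reinforcement Learning Laboratory (DeepRL-Lab). It comprises 11 parts in total, including books, courses, environments, algorithms, applications, and open-source frameworks.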

This article is kept in sync with the GitHub repository:

GitHub repository A-Guide-Resource-For-DeepRL. You are welcome to star, fork, and contribute.


1. Books

  1. Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto (2017), Chinese-Edition, Code
  2. Algorithms for Reinforcement Learning by Csaba Szepesvari (updated 2019)
  3. Deep Reinforcement Learning Hands-On by Maxim Lapan (2018),Code
  4. Reinforcement Learning: State-of-the-Art by Marco Wiering, Martijn van Otterlo
  5. Deep Reinforcement Learning in Action by Alexander Zai and Brandon Brown (in progress)
  6. Grokking Deep Reinforcement Learning by Miguel Morales (in progress)
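  7. Multi-Agent Machine Learning: A Reinforcement Approach [Baidu Cloud link] by Howard M. Schwartz (2017)
  8. Reinforcement Learning at Alibaba: Technical Evolution and Business Innovation (强化学习在阿里的技术演进与业务创新) by Alibaba Group
  9. Hands-On Reinforcement Learning with Python (Baidu Cloud link)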
  10. Reinforcement Learning And Optimal Control by Dimitri P. Bertsekas, 2019

2. Courses

  1. UCL Course on RL(★★★) by David Silver, Video-en, Video-zh
  2. OpenAI's Spinning Up in Deep RL by OpenAI(2018)
  3. Udacity-Deep Reinforcement learning, 2019-10-31
  4. Stanford CS-234: Reinforcement Learning (2019), Videos
  5. DeepMind Advanced Deep Learning & Reinforcement Learning (2018),Videos
  6. GeorgiaTech CS-8803 Deep Reinforcement Learning (2018?)
  7. UC Berkeley CS294-112 Deep Reinforcement Learning (2018 Fall),Video-zh
  8. Deep RL Bootcamp by Berkeley CA(2017)
  9. Thomas Simonini's Deep Reinforcement Learning Course
  10. CS-6101 Deep Reinforcement Learning, NUS SoC, 2018/2019, Semester II
  11. Course on Reinforcement Learning by Alessandro Lazaric,2018
  12. Learn Deep Reinforcement Learning in 60 days

3. Survey-and-Frontier

  1. Deep Reinforcement Learning by Yuxi Li
  2. Algorithms for Reinforcement Learning by Csaba Szepesvari (Morgan & Claypool, 2009)
  3. Modern Deep Reinforcement Learning Algorithms by Sergey Ivanov(54-Page)
  4. Deep Reinforcement Learning: An Overview (2018)
  5. A Brief Survey of Deep Reinforcement Learning (2017)
  6. Deep Reinforcement Learning Doesn't Work Yet(★) by Irpan, Alex(2018), ChineseVersion
  7. Deep Reinforcement Learning that Matters(★) by Peter Henderson, Riashat Islam
  8. A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress
  9. Applications of Deep Reinforcement Learning in Communications and Networking: A Survey
  10. An Introduction to Deep Reinforcement Learning
  11. Challenges of Real-World Reinforcement Learning
  12. Topics in Reinforcement Learning
  13. Reinforcement Learning: A Survey,1996.
  14. A Tutorial Survey of Reinforcement Learning, Sadhana,1994.
  15. Reinforcement Learning in Robotics, A Survey, 2013
  16. A Survey of Deep Network Solutions for Learning Control in Robotics: From Reinforcement to Imitation, 2018
  17. Universal Reinforcement Learning Algorithms: Survey and Experiments,2017
  18. Bayesian Reinforcement Learning: A Survey, 2016
  19. Benchmarking Reinforcement Learning Algorithms on Real-World Robots

4. Environment-and-Framework


5. Baselines-and-Benchmarks

  1. github.com/openai/basel [stable-baselines]
  2. rl-baselines-zoo
  3. ROBEL (google-research/robel)
  4. RLBench (stepjam/RLBench)
  5. martin-thoma.com/sota/#
  6. github.com/rlworkgroup/
  7. Atari Environments Scores

6. Algorithms

1. DQN series

  1. Playing Atari with Deep Reinforcement Learning [arxiv] [code]
  2. Deep Reinforcement Learning with Double Q-learning [arxiv] [code]
  3. Dueling Network Architectures for Deep Reinforcement Learning [arxiv] [code]
  4. Prioritized Experience Replay [arxiv] [code]
  5. Noisy Networks for Exploration [arxiv] [code]
  6. A Distributional Perspective on Reinforcement Learning [arxiv] [code]
  7. Rainbow: Combining Improvements in Deep Reinforcement Learning [arxiv] [code]

2. Others

Algorithm Coding

  1. Deep-Reinforcement-Learning-Algorithms-with-PyTorch

7. Applications

7.1 Basic

  1. Reinforcement Learning Applications
  2. IntelliLight: A Reinforcement Learning Approach for Intelligent Traffic Light Control by Hua Wei,Guanjie Zheng(2018)
  3. Deep Reinforcement Learning by Yuxi Li, 2018
  4. Deep Reinforcement Learning in Robotics

7.2 Robotics

  • Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion (Kohl, ICRA 2004) [Paper]
  • Robot Motor Skill Coordination with EM-based Reinforcement Learning (Kormushev, IROS 2010) [Paper] [Video]
  • Generalized Model Learning for Reinforcement Learning on a Humanoid Robot (Hester, ICRA 2010) [Paper] [Video]
  • Autonomous Skill Acquisition on a Mobile Manipulator (Konidaris, AAAI 2011) [Paper] [Video]
  • PILCO: A Model-Based and Data-Efficient Approach to Policy Search (Deisenroth, ICML 2011) [Paper]
  • Incremental Semantically Grounded Learning from Demonstration (Niekum, RSS 2013) [Paper]
  • Efficient Reinforcement Learning for Robots using Informative Simulated Priors (Cutler, ICRA 2015) [Paper] [Video]
  • Robots that can adapt like animals (Cully, Nature 2015) [Paper] [Video] [Code]
  • Black-Box Data-efficient Policy Search for Robotics (Chatzilygeroudis, IROS 2017) [Paper] [Video] [Code]

8. Advanced-Topics

8.1. Model-free RL

  1. Playing Atari with Deep Reinforcement Learning NIPS Deep Learning Workshop 2013. paper
    Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller
  2. Human-level control through deep reinforcement learning Nature 2015. paper
    Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg & Demis Hassabis
  3. Deep Reinforcement Learning with Double Q-learning AAAI 16. paper
    Hado van Hasselt, Arthur Guez, David Silver
  4. Dueling Network Architectures for Deep Reinforcement Learning ICML16. paper
    Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas
  5. Deep Recurrent Q-Learning for Partially Observable MDPs AAAI 15. paper
    Matthew Hausknecht, Peter Stone
  6. Prioritized Experience Replay ICLR 2016. paper
    Tom Schaul, John Quan, Ioannis Antonoglou, David Silver
  7. Asynchronous Methods for Deep Reinforcement Learning ICML2016. paper
    Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu
  8. A Distributional Perspective on Reinforcement Learning ICML2017. paper
    Marc G. Bellemare, Will Dabney, Rémi Munos
  9. Noisy Networks for Exploration ICLR2018. paper
    Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Remi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg
  10. Rainbow: Combining Improvements in Deep Reinforcement Learning AAAI2018. paper
    Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver

8.2. Model-based RL

  1. Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion NIPS2018. paper
    Jacob Buckman, Danijar Hafner, George Tucker, Eugene Brevdo, Honglak Lee
  2. Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning ICML2018. paper
    Vladimir Feinberg, Alvin Wan, Ion Stoica, Michael I. Jordan, Joseph E. Gonzalez, Sergey Levine
  3. Value Prediction Network NIPS2017. paper
    Junhyuk Oh, Satinder Singh, Honglak Lee
  4. Imagination-Augmented Agents for Deep Reinforcement Learning NIPS2017. paper
    Théophane Weber, Sébastien Racanière, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adria Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter Battaglia, Demis Hassabis, David Silver, Daan Wierstra
  5. Continuous Deep Q-Learning with Model-based Acceleration ICML2016. paper
    Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine
  6. Uncertainty-driven Imagination for Continuous Deep Reinforcement Learning CoRL2017. paper
    Gabriel Kalweit, Joschka Boedecker
  7. Model-Ensemble Trust-Region Policy Optimization ICLR2018. paper
    Thanard Kurutach, Ignasi Clavera, Yan Duan, Aviv Tamar, Pieter Abbeel
  8. Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models NIPS2018. paper
    Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine
  9. Dyna, an integrated architecture for learning, planning, and reacting ACM1991. paper
    Sutton, Richard S
  10. Learning Continuous Control Policies by Stochastic Value Gradients NIPS 2015. paper
    Nicolas Heess, Greg Wayne, David Silver, Timothy Lillicrap, Yuval Tassa, Tom Erez
  11. Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks ICLR 2017. paper
    Stefan Depeweg, José Miguel Hernández-Lobato, Finale Doshi-Velez, Steffen Udluft

8.3 Function Approximation methods (Least-Square Temporal Difference, Least-Square Policy Iteration)

  • Linear Least-Squares Algorithms for Temporal Difference Learning, Machine Learning, 1996. [Paper]
  • Model-Free Least Squares Policy Iteration, NIPS, 2001. [Paper] [Code]

8.4 Policy Search/Policy Gradient

  • Policy Gradient Methods for Reinforcement Learning with Function Approximation, NIPS, 1999. [Paper]
  • Natural Actor-Critic, ECML, 2005. [Paper]
  • Policy Search for Motor Primitives in Robotics, NIPS, 2009. [Paper]
  • Relative Entropy Policy Search, AAAI, 2010. [Paper]
  • Path Integral Policy Improvement with Covariance Matrix Adaptation, ICML, 2012. [Paper]
  • Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion, ICRA, 2004. [Paper]
  • PILCO: A Model-Based and Data-Efficient Approach to Policy Search, ICML, 2011. [Paper]
  • Learning Dynamic Arm Motions for Postural Recovery, Humanoids, 2011. [Paper]
  • Black-Box Data-efficient Policy Search for Robotics, IROS, 2017. [Paper]

8.5 Hierarchical RL

  • Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, Artificial Intelligence, 1999. [Paper]
  • Building Portable Options: Skill Transfer in Reinforcement Learning, IJCAI, 2007. [Paper]

8.6 Inverse RL

  1. updating..........

8.7 Meta RL

  1. updating..........

8.8. Rewards

  1. Deep Reinforcement Learning Models: Tips & Tricks for Writing Reward Functions
  2. Meta Reward Learning

8.9. Policy Gradient

  1. Policy Gradient

8.10. Distributed Reinforcement Learning

  1. Asynchronous Methods for Deep Reinforcement Learning, ICML 2016. paper
  2. GA3C: GPU-based A3C for Deep Reinforcement Learning by Iuri Frosio, Stephen Tyree, NIPS 2016
  3. Distributed Prioritized Experience Replay by Dan Horgan, John Quan, David Budden, ICLR 2018
  4. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures by Lasse Espeholt, Hubert Soyer, Remi Munos, ICML 2018
  5. Distributed Distributional Deterministic Policy Gradients by Gabriel Barth-Maron, Matthew W. Hoffman, ICLR 2018.
  6. Emergence of Locomotion Behaviours in Rich Environments by Nicolas Heess, Dhruva TB, Srinivasan Sriram, 2017
  7. GPU-Accelerated Robotic Simulation for Distributed Reinforcement Learning by Jacky Liang, Viktor Makoviychuk, 2018
  8. Recurrent Experience Replay in Distributed Reinforcement Learning by Steven Kapturowski, Georg Ostrovski, ICLR 2019.

9. Related-Courses

9.1. Game Theory

  1. Game Theory Course, Yale University
  2. Game Theory - The Full Course, Stanford University
  3. Algorithmic Game Theory (CS364A, Fall 2013), Stanford University

9.2. Other

......


10. Multi-Agents

10.1 Tutorial and Books

10.2 Review Papers

10.3 Framework papers

10.4 Joint action learning

10.5 Cooperation and competition

10.6 Coordination

10.7 Security

10.8 Self-Play

10.9 Learning To Communicate

10.10 Transfer Learning

10.11 Imitation and Inverse Reinforcement Learning

10.12 Meta Learning

10.13 Application


11. Paper-Resources

2019-07

Jun

April-May

March 2019

Feb 2019

Jan 2019

2018

  • Accelerated Methods for Deep Reinforcement Learning. arxiv
  • A Deep Reinforcement Learning Chatbot (Short Version). arxiv
  • AlphaX: eXploring Neural Architectures with Deep Neural Networks and Monte Carlo Tree Search. arxiv (★)
  • A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress. arxiv
  • Composable Deep Reinforcement Learning for Robotic Manipulation. arxiv
  • Cooperative Multi-Agent Reinforcement Learning for Low-Level Wireless Communication. arxiv
  • Deep Reinforcement Fuzzing. arxiv
  • Deep Reinforcement Learning of Cell Movement in the Early Stage of C. elegans Embryogenesis. arxiv
  • Deep Reinforcement Learning For Sequence to Sequence Models. arxiv code
  • Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods. arxiv
  • Deep Reinforcement Learning in Portfolio Management. arxiv code
  • Deep Reinforcement Learning using Capsules in Advanced Game Environments. arxiv
  • Deep Reinforcement Learning with Model Learning and Monte Carlo Tree Search in Minecraft. arxiv
  • Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes. arxiv code
  • Diversity is All You Need: Learning Skills without a Reward Function. arxiv
  • Faster Deep Q-learning using Neural Episodic Control. arxiv
  • Feedback-Based Tree Search for Reinforcement Learning. arxiv
  • Feudal Reinforcement Learning for Dialogue Management in Large Domains. arxiv
  • Forward-Backward Reinforcement Learning. arxiv
  • Hierarchical Reinforcement Learning: Approximating Optimal Discounted TSP Using Local Policies. arxiv
  • IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures. arxiv
  • Kickstarting Deep Reinforcement Learning. arxiv
  • Learning a Prior over Intent via Meta-Inverse Reinforcement Learning. arxiv
  • Meta Reinforcement Learning with Latent Variable Gaussian Processes. arxiv
  • Multi-Agent Reinforcement Learning: A Report on Challenges and Approaches. arxiv
  • Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations. arxiv
  • Psychlab: A Psychology Laboratory for Deep Reinforcement Learning Agents. arxiv
  • Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning. arxiv
  • Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review. arxiv
  • Reinforcement Learning from Imperfect Demonstrations. arxiv
  • Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application. arxiv
  • RUDDER: Return Decomposition for Delayed Rewards. arxiv code
  • Semi-parametric Topological Memory for Navigation. arxiv tensorflow
  • Shared Autonomy via Deep Reinforcement Learning. arxiv
  • Setting up a Reinforcement Learning Task with a Real-World Robot. arxiv
  • Simple random search provides a competitive approach to reinforcement learning. arxiv code
  • Unsupervised Meta-Learning for Reinforcement Learning. arxiv
  • Using reinforcement learning to learn how to play text-based games. arxiv

Due to the length limit, see the repository for more.

More About

These documents will be updated in sync with my personal blog and knowledge column:

1. CSDN blog: A Guide Resource for Deep Reinforcement Learning
2. Zhihu column: A Guide Resource for Deep Reinforcement Learning
3. WeChat official account


Cite

This summary of deep reinforcement learning materials draws on the sources listed below, and we would like to express our heartfelt thanks to their authors.

[1].github.com/brianspierin
[2].github.com/jgvictores/a
[3].github.com/PaddlePaddle
[4].github.com/LantaoYu/MAR
[5].github.com/gopala-kr/DR
[6].github.com/junhyukoh/de
[7].eff.org/ai/metrics#
[8].agi.university/the-land
[9].github.com/tigerneil/aw
[10].planspace.org/20170830-
[11].aikorea.org/awesome-rl/
[12].github.com/junhyukoh/de

Edited on 2020-05-13 18:49