[Deep Reinforcement Learning Tutorial] A Collection of High-Quality PyTorch Implementations


[Introduction] A collection of high-quality implementations of deep reinforcement learning algorithms, written in PyTorch.



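Author: These IPython notebooks are mainly meant to help me practice and understand the papers I have read, so in some cases I will favor readability over efficiency. For each paper, I upload the implementation first and then add markup explaining each part of the code.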


Related Papers



  1. Human-Level Control Through Deep Reinforcement Learning

     [Publication] https://deepmind.com/research/publications/human-level-control-through-deep-reinforcement-learning/

     [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/01.DQN.ipynb

  2. Multi-Step Learning (from Reinforcement Learning: An Introduction, Chapter 7) 

    [Publication] http://incompleteideas.net/book/the-book-2nd.html

    [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/02.NStep_DQN.ipynb

  3. Deep Reinforcement Learning with Double Q-learning 

    [Publication] https://arxiv.org/abs/1509.06461

    [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/03.Double_DQN.ipynb

  4. Dueling Network Architectures for Deep Reinforcement Learning 

    [Publication] https://arxiv.org/abs/1511.06581

    [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/04.Dueling_DQN.ipynb

  5. Noisy Networks for Exploration 

    [Publication] https://arxiv.org/abs/1706.10295

    [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/05.DQN-NoisyNets.ipynb

  6. Prioritized Experience Replay 

    [Publication] https://arxiv.org/abs/1511.05952?context=cs

    [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/06.DQN_PriorityReplay.ipynb

  7. A Distributional Perspective on Reinforcement Learning 

    [Publication] https://arxiv.org/abs/1707.06887

    [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/07.Categorical-DQN.ipynb

  8. Rainbow: Combining Improvements in Deep Reinforcement Learning 

    [Publication] https://arxiv.org/abs/1710.02298

    [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/08.Rainbow.ipynb

  9. Distributional Reinforcement Learning with Quantile Regression 

    [Publication] https://arxiv.org/abs/1710.10044

    [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/09.QuantileRegression-DQN.ipynb

  10. Rainbow with Quantile Regression 

    [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/10.Quantile-Rainbow.ipynb

  11. Deep Recurrent Q-Learning for Partially Observable MDPs 

    [Publication] https://arxiv.org/abs/1507.06527

    [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/11.DRQN.ipynb

  12. Advantage Actor Critic (A2C) 

    [Publication1] https://arxiv.org/abs/1602.01783

    [Publication2] https://blog.openai.com/baselines-acktr-a2c/

    [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/12.A2C.ipynb

  13. High-Dimensional Continuous Control Using Generalized Advantage Estimation 

    [Publication] https://arxiv.org/abs/1506.02438

    [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/13.GAE.ipynb

  14. Proximal Policy Optimization Algorithms 

    [Publication] https://arxiv.org/abs/1707.06347

    [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/14.PPO.ipynb
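As a rough illustration of the multi-step idea behind entry 2, the n-step bootstrapped target can be sketched in plain Python. This sketch is not taken from the notebooks: the function name `n_step_target` and its parameters are hypothetical and chosen only for illustration, and the notebooks themselves work on PyTorch tensors and replay buffers rather than Python lists.

```python
def n_step_target(rewards, bootstrap_value, gamma=0.99):
    """Hypothetical helper: compute an n-step return.

    Returns r_0 + gamma * r_1 + ... + gamma^(n-1) * r_(n-1)
            + gamma^n * bootstrap_value,
    where n = len(rewards) and bootstrap_value is the value estimate
    of the state reached after the n steps (use 0.0 if it is terminal).
    """
    target = bootstrap_value
    # Fold from the last reward backwards so that every step
    # applies exactly one more factor of gamma.
    for reward in reversed(rewards):
        target = reward + gamma * target
    return target


# Three rewards of 1.0, a bootstrap value of 10.0, and gamma = 0.5:
# 1 + 0.5 * (1 + 0.5 * (1 + 0.5 * 10)) = 3.0
print(n_step_target([1.0, 1.0, 1.0], 10.0, gamma=0.5))  # 3.0
```

With a single reward this reduces to the familiar one-step target `r + gamma * V(s')`, which is why the multi-step variant can be seen as a small change to the sampling and target code rather than a new algorithm.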


PyTorch Implementation



DeepRL-Tutorials

https://github.com/qfettes/DeepRL-Tutorials

