Reinforcement learning (RL) typically defines a discount factor as part of the Markov Decision Process. The discount factor values future rewards by an exponential scheme that leads to theoretical convergence guarantees of the Bellman equation. However, evidence from psychology, economics and neuroscience suggests that humans and animals instead have hyperbolic time-preferences. In this work we revisit the fundamentals of discounting in RL and bridge this disconnect by implementing an RL agent that acts via hyperbolic discounting. We demonstrate that a simple approach approximates hyperbolic discount functions while still using familiar temporal-difference learning techniques in RL. Additionally, and independent of hyperbolic discounting, we make a surprising discovery that simultaneously learning value functions over multiple time-horizons is an effective auxiliary task which often improves over a strong value-based RL agent, Rainbow.
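The "simple approach" the abstract alludes to can be made concrete with one identity: the hyperbolic discount function $d_k(t) = 1/(1+kt)$ is an integral over exponential discount functions, since $\int_0^1 \gamma^{kt}\, d\gamma = \frac{1}{1+kt}$. A hyperbolically discounted value is therefore an integral over standard exponentially discounted values, $V_{\text{hyper}}(s) = \int_0^1 V_{\gamma^k}(s)\, d\gamma$, and a Riemann sum over a finite grid of discount factors approximates it, with each component value function learned by ordinary temporal-difference updates. Below is a minimal Python sketch of that Riemann-sum combination; the grid, weights, and function names are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def hyperbolic_q(q_heads, gamma_grid):
    """Riemann-sum approximation of a hyperbolically discounted Q-value.

    q_heads:    per-discount estimates, q_heads[i] = Q_{Gamma_i}(s, a), each
                trained with ordinary TD learning at discount Gamma_i
                (Gamma_i = gamma_grid[i] ** k; k is folded into the grid here).
    gamma_grid: increasing points gamma_i in (0, 1] partitioning [0, 1] for
                the integral 1/(1 + k*t) = integral_0^1 gamma^(k*t) d(gamma).
    """
    # Width of each slice of [0, 1]; the first slice starts at 0.
    widths = np.diff(np.concatenate(([0.0], gamma_grid)))
    return float(np.sum(widths * np.asarray(q_heads)))

# Sanity check with a single reward of 1 arriving at delay T (k = 1):
# each exponential head values it at gamma_i ** T, and the hyperbolic
# value should be d(T) = 1 / (1 + T).
T = 10
gammas = np.linspace(1e-3, 1.0, 10_000)  # dense grid over (0, 1]
q_heads = gammas ** T                    # toy Q_{gamma_i} for this reward
print(hyperbolic_q(q_heads, gammas))     # ~0.091, close to 1/11
```

In an agent, each head can share a network body and differ only in the discount used in its TD target, which is also what makes the multi-horizon heads cheap to add as the auxiliary task mentioned above.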