July 25, 2022

Predecessor Features

Duncan Bailey, Marcelo G. Mattar

From arXiv; accepted to RLDM 2022

Any reinforcement learning system must be able to identify which past events contributed to observed outcomes, a problem known as credit assignment. A common solution to this problem is to use an eligibility trace to assign credit to a recency-weighted set of experienced events. However, in many realistic tasks, the set of recently experienced events is only one of the many possible action events that could have preceded the current outcome. This suggests that reinforcement learning can be made more efficient by allowing credit assignment to any viable preceding state, rather than only those most recently experienced. Accordingly, we examine "Predecessor Features", the fully bootstrapped version of van Hasselt's "Expected Trace", an algorithm that achieves this richer form of credit assignment. By maintaining a representation that approximates the expected sum of past occupancies, this algorithm allows temporal difference (TD) errors to be propagated accurately to a larger number of predecessor states than conventional methods, greatly improving learning speed. The algorithm can also be naturally extended from tabular state representations to feature representations, allowing for increased performance on a wide range of environments. We demonstrate several use cases for Predecessor Features and compare its performance with other approaches.
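The abstract does not give update rules, so the following is only an illustrative tabular sketch of the idea it describes: a table approximating the expected discounted sum of past occupancies is learned by bootstrapping from the previous state's row, and each TD error is then spread over that row instead of over a sampled eligibility trace. The function name `run_predecessor_td`, the parameters `alpha`, `beta`, `gamma`, and `lam`, and the deterministic chain environment are all assumptions made for this example and are not taken from the paper.

```python
def run_predecessor_td(n_states=5, episodes=500, alpha=0.1, beta=0.1,
                       gamma=0.9, lam=1.0):
    """Tabular TD learning with a learned predecessor table (illustrative).

    Hypothetical environment: a chain of states 0..n_states-1. The agent
    always steps right, and leaving the last state ends the episode with
    reward 1. All other rewards are 0.
    """
    v = [0.0] * n_states
    # pred[s][k] estimates the expected discounted past occupancy of
    # state k, given that the agent is currently in state s.
    pred = [[0.0] * n_states for _ in range(n_states)]

    for _ in range(episodes):
        prev = None
        for s in range(n_states):
            # Bootstrapped target (assumed form):
            #   one_hot(s) + gamma * lam * pred[prev]
            for k in range(n_states):
                target = 1.0 if k == s else 0.0
                if prev is not None:
                    target += gamma * lam * pred[prev][k]
                pred[s][k] += beta * (target - pred[s][k])

            # Ordinary one-step TD error for the transition out of s.
            terminal = (s == n_states - 1)
            reward = 1.0 if terminal else 0.0
            next_v = 0.0 if terminal else v[s + 1]
            delta = reward + gamma * next_v - v[s]

            # Propagate the TD error to every expected predecessor of s.
            for k in range(n_states):
                v[k] += alpha * delta * pred[s][k]
            prev = s
    return v, pred


if __name__ == "__main__":
    values, predecessors = run_predecessor_td()
    print("state values:", [round(x, 3) for x in values])
    print("predecessor row of last state:",
          [round(x, 3) for x in predecessors[-1]])
```

In this toy chain every state has exactly one predecessor path, so the learned table carries the same information as a conventional trace. The benefit the abstract points to would show up in environments where many different paths can lead into the same state, since the table then assigns credit to predecessors that were not visited in the current episode.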

