December 29, 2022

Invariance to Quantile Selection in Distributional Continuous Control


Felix Grün, Muhammad Saif-ur-Rehman, Tobias Glasmachers, Ioannis Iossifidis

In recent years, distributional reinforcement learning has produced many state-of-the-art results. Increasingly sample-efficient distributional algorithms for the discrete action domain have been developed over time; they vary primarily in how they parameterize their approximations of value distributions and in how they quantify the differences between those distributions. In this work, we transfer three of the most well-known and successful of those algorithms (QR-DQN, IQN and FQF) to the continuous action domain by extending two powerful actor-critic algorithms (TD3 and SAC) with distributional critics. We investigate whether the relative performance of the methods in the discrete action space translates to the continuous case. To that end, we compare them empirically on the PyBullet implementations of a set of continuous control tasks. Our results indicate qualitative invariance regarding the number and placement of distributional atoms in the deterministic, continuous action setting.
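As a rough illustration of how a quantile-based critic in the style of QR-DQN measures the difference between two value distributions, the sketch below computes a quantile Huber loss in plain Python. It is not taken from the paper's implementation: the function name, the argument layout, and the choice of fixed midpoint quantile fractions tau_i = (2i + 1) / (2N) are assumptions made for illustration. IQN and FQF differ mainly in how the fractions are chosen (sampled or learned), which this sketch does not cover.

```python
def quantile_huber_loss(pred, target, kappa=1.0):
    """Quantile Huber loss between predicted quantiles and target samples.

    pred:   list of N predicted quantile values (hypothetical critic output)
    target: list of M target values (e.g. samples of r + gamma * Z(s', a'))
    kappa:  Huber threshold; must be positive
    """
    n = len(pred)
    # Assumption: fixed midpoint quantile fractions, as in QR-DQN.
    taus = [(2 * i + 1) / (2 * n) for i in range(n)]

    total = 0.0
    for tau, q in zip(taus, pred):
        row = 0.0
        for z in target:
            u = z - q  # pairwise TD error
            if abs(u) <= kappa:
                huber = 0.5 * u * u
            else:
                huber = kappa * (abs(u) - 0.5 * kappa)
            # Asymmetric weight: over-estimates (u < 0) are weighted by
            # 1 - tau, under-estimates by tau.
            indicator = 1.0 if u < 0 else 0.0
            # Some formulations omit the division by kappa; with
            # kappa = 1 the two variants coincide.
            row += abs(tau - indicator) * huber / kappa
        total += row / len(target)  # mean over targets, sum over quantiles
    return total
```

For example, `quantile_huber_loss([0.0], [1.0])` evaluates to 0.25: the single fraction is 0.5, the error is 1, the Huber term is 0.5, and the weight is 0.5.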

