The Smoothed Complexity of Policy Iteration for Markov Decision Processes - 专知 Papers

November 30, 2022

The Smoothed Complexity of Policy Iteration for Markov Decision Processes

Miranda Christ, Mihalis Yannakakis

We show subexponential lower bounds (i.e., $2^{\Omega (n^c)}$) on the smoothed complexity of the classical Howard's Policy Iteration algorithm for Markov Decision Processes. The bounds hold for the total reward and the average reward criteria. The constructions are robust in the sense that the subexponential bound holds not only on the average for independent random perturbations of the MDP parameters (transition probabilities and rewards), but for all arbitrary perturbations within an inverse polynomial range. We show also an exponential lower bound on the worst-case complexity for the simple reachability objective.
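For readers unfamiliar with the algorithm the abstract analyzes, the following is a minimal sketch of Howard's Policy Iteration under the total reward criterion. It is an illustration only, not the authors' construction: the MDP encoding (`mdp[state][action] = (reward, {next_state: probability})`), the implicit zero-value sink state `"T"`, the helper names, and the three-state `example_mdp` are all assumptions made here. The sketch also assumes every policy reaches the sink with probability 1, so that the evaluation system has a unique solution.

```python
from fractions import Fraction


def evaluate(mdp, policy):
    """Solve v = r_pi + P_pi v exactly by Gauss-Jordan elimination.

    Any successor that is not a key of `mdp` is treated as a sink of value 0.
    Assumes the policy reaches the sink with probability 1 (hypothetical setup).
    """
    states = sorted(mdp)
    idx = {s: i for i, s in enumerate(states)}
    n = len(states)
    # Augmented matrix for the linear system (I - P_pi) v = r_pi.
    a = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for s in states:
        reward, trans = mdp[s][policy[s]]
        i = idx[s]
        a[i][i] += 1
        a[i][n] = Fraction(reward)
        for t, p in trans.items():
            if t in idx:
                a[i][idx[t]] -= Fraction(p)
    for col in range(n):
        piv = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[piv] = a[piv], a[col]
        pivot = a[col][col]
        a[col] = [x / pivot for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return {s: a[idx[s]][n] for s in states}


def q_value(mdp, v, s, action):
    """One-step lookahead value of taking `action` in state `s`."""
    reward, trans = mdp[s][action]
    return Fraction(reward) + sum(Fraction(p) * v.get(t, 0) for t, p in trans.items())


def howard_policy_iteration(mdp, policy):
    """Howard's rule: switch every improvable state simultaneously."""
    policy = dict(policy)
    iterations = 0
    while True:
        v = evaluate(mdp, policy)
        new_policy = dict(policy)
        improved = False
        for s in mdp:
            best = max(mdp[s], key=lambda act: q_value(mdp, v, s, act))
            if q_value(mdp, v, s, best) > v[s]:  # strict improvement only
                new_policy[s] = best
                improved = True
        if not improved:
            return policy, v, iterations
        policy = new_policy
        iterations += 1


# Hypothetical toy instance: "T" is the implicit sink.
example_mdp = {
    "a": {"stop": (1, {"T": 1}), "go": (0, {"b": 1})},
    "b": {"stop": (2, {"T": 1}), "go": (0, {"c": Fraction(1, 2), "T": Fraction(1, 2)})},
    "c": {"stop": (10, {"T": 1})},
}
start = {"a": "stop", "b": "stop", "c": "stop"}
final_policy, values, rounds = howard_policy_iteration(example_mdp, start)
```

On this toy instance a single improvement round switches both `a` and `b` to `go` at once, which is the defining feature of Howard's rule compared with single-switch variants. The lower bounds in the abstract concern how many such rounds can be forced on carefully constructed MDPs, even after perturbing their parameters.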
