Statistical Estimation of Confounded Linear MDPs: An Instrumental Variable Approach


Estimation/Estimators · Statistics · Analysis · Linear · Markov

September 12, 2022


Miao Lu, Wenhao Yang, Liangyu Zhang, Zhihua Zhang

In a Markov decision process (MDP), unobservable confounders may exist and affect the data-generating process, so that classic off-policy evaluation (OPE) estimators may fail to identify the true value function of the target policy. In this paper, we study the statistical properties of OPE in confounded MDPs with observable instrumental variables. Specifically, we propose a two-stage estimator based on the instrumental variables and establish its statistical properties in confounded MDPs with a linear structure. For the non-asymptotic analysis, we prove an $\mathcal{O}(n^{-1/2})$ error bound, where $n$ is the number of samples. For the asymptotic analysis, we prove that the two-stage estimator is asymptotically normal at the typical rate of $n^{1/2}$. To the best of our knowledge, we are the first to establish such statistical results for the two-stage estimator in confounded linear MDPs via instrumental variables.
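The two-stage idea mentioned in the abstract can be illustrated with classical two-stage least squares (2SLS) on synthetic data. This is only a minimal sketch of the generic instrumental-variable technique, not the paper's estimator for confounded linear MDPs; the function names, the data-generating process, and the coefficient values are all assumptions chosen for illustration.

```python
import numpy as np

def two_stage_least_squares(z, x, y):
    """Generic 2SLS (illustrative): regress x on z, then y on the fitted x."""
    # Stage 1: project the confounded regressor onto the instrument.
    gamma, *_ = np.linalg.lstsq(z, x, rcond=None)
    x_hat = z @ gamma
    # Stage 2: regress the outcome on the projected regressor.
    beta, *_ = np.linalg.lstsq(x_hat, y, rcond=None)
    return beta

def simulate(n, beta=2.0, seed=0):
    """Hypothetical data: u confounds both x and y; z is a valid instrument."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                      # unobserved confounder
    z = rng.normal(size=n)                      # instrument, independent of u
    x = z + u + 0.1 * rng.normal(size=n)        # regressor driven by z and u
    y = beta * x + u + 0.1 * rng.normal(size=n) # outcome, also driven by u
    return z[:, None], x[:, None], y

z, x, y = simulate(20000)
ols, *_ = np.linalg.lstsq(x, y, rcond=None)     # naive estimate, ignores u
iv = two_stage_least_squares(z, x, y)
print("OLS estimate:", ols[0])
print("2SLS estimate:", iv[0])
```

In this toy setting the naive least-squares estimate is pushed away from the true coefficient by the confounder, while the two-stage estimate stays close to it, because the instrument shifts the regressor without being correlated with the confounder.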


