Thompson Sampling for Linear Bandit Problems with Normal-Gamma Priors - 专知论文


Bandits · Linear · Sample · Variance · Conjugate prior

March 6, 2023

Thompson Sampling for Linear Bandit Problems with Normal-Gamma Priors


Björn Lindenberg, Karl-Olof Lindahl

from arxiv, 27 pages, 2 figures

We consider Thompson sampling for linear bandit problems with finitely many independent arms, where rewards are sampled from normal distributions that are linearly dependent on unknown parameter vectors and with unknown variance. Specifically, with a Bayesian formulation we consider multivariate normal-gamma priors to represent environment uncertainty for all involved parameters. We show that our chosen sampling prior is a conjugate prior to the reward model and derive a Bayesian regret bound for Thompson sampling under the condition that the 5/2-moment of the variance distribution exists.
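The setting in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm or notation. It assumes each arm has its own parameter vector and noise precision, that one context vector `x` is shared by all arms in each round, and that the prior is the textbook normal-gamma prior with zero mean. The class name `NormalGammaArm`, the function `run`, and the hyperparameters `alpha0`, `beta0`, and `prior_precision` are hypothetical choices made for illustration.

```python
import numpy as np


class NormalGammaArm:
    """Normal-gamma posterior for one arm with reward r = x @ theta + noise.

    Assumed prior (illustrative, zero prior mean):
        tau ~ Gamma(alpha0, rate=beta0)
        theta | tau ~ N(0, (tau * Lam0)^-1)
    """

    def __init__(self, dim, alpha0=2.0, beta0=1.0, prior_precision=1.0):
        self.Lam = prior_precision * np.eye(dim)  # precision-scale matrix
        self.b = np.zeros(dim)                    # equals Lam @ posterior mean
        self.alpha = alpha0                       # gamma shape
        self.beta0 = beta0                        # prior gamma rate
        self.sq = 0.0                             # running sum of squared rewards

    def mean(self):
        return np.linalg.solve(self.Lam, self.b)

    def beta(self):
        # Posterior gamma rate: beta0 + (sum r^2 - mu^T Lam mu) / 2.
        mu = self.mean()
        return self.beta0 + 0.5 * (self.sq - mu @ self.Lam @ mu)

    def update(self, x, r):
        # Conjugate update after observing reward r for context x.
        self.Lam += np.outer(x, x)
        self.b += r * x
        self.alpha += 0.5
        self.sq += r * r

    def sample(self, rng):
        # Draw the precision tau, then theta | tau ~ N(mu, (tau * Lam)^-1).
        tau = rng.gamma(shape=self.alpha, scale=1.0 / self.beta())
        chol = np.linalg.cholesky(self.Lam)       # Lam = chol @ chol.T
        z = rng.standard_normal(self.b.shape[0])
        return self.mean() + np.linalg.solve(chol.T, z) / np.sqrt(tau)


def run(true_thetas, true_sigmas, horizon=1000, seed=0):
    """Simulate Thompson sampling on a synthetic problem; return arms and regret."""
    rng = np.random.default_rng(seed)
    dim = true_thetas.shape[1]
    arms = [NormalGammaArm(dim) for _ in range(len(true_thetas))]
    regret = 0.0
    for _ in range(horizon):
        x = rng.normal(size=dim)                  # assumed shared context
        # Thompson step: sample parameters per arm, act greedily on the samples.
        a = int(np.argmax([x @ arm.sample(rng) for arm in arms]))
        r = x @ true_thetas[a] + true_sigmas[a] * rng.normal()
        arms[a].update(x, r)
        expected = true_thetas @ x
        regret += expected.max() - expected[a]
    return arms, regret


if __name__ == "__main__":
    thetas = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
    sigmas = np.array([0.5, 1.0, 2.0])
    _, total_regret = run(thetas, sigmas)
    print("cumulative regret:", total_regret)
```

Because the update only adds to `Lam`, `b`, `alpha`, and `sq`, each round costs one linear solve and one Cholesky factorization per arm. Conjugacy is what keeps the posterior in closed form, so no approximate inference is needed in this sketch.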

