The Elliptical Potential Lemma for General Distributions with an Application to Linear Thompson Sampling


January 19, 2022


Nima Hamidi, Mohsen Bayati

From arXiv; accepted to Operations Research

In this note, we introduce a general version of the well-known elliptical potential lemma, a widely used technique in the analysis of algorithms for sequential learning and decision-making problems. We consider a stochastic linear bandit setting in which a decision-maker sequentially chooses among a set of given actions, observes their noisy rewards, and aims to maximize her cumulative expected reward over a decision-making horizon. The elliptical potential lemma is a key tool for quantifying uncertainty in estimating the parameters of the reward function, but it requires the noise and the prior distributions to be Gaussian. Our general elliptical potential lemma relaxes this Gaussian requirement, which is a highly non-trivial extension for a number of reasons: unlike the Gaussian case, there is no closed-form solution for the covariance matrix of the posterior distribution, the covariance matrix is not a deterministic function of the actions, and the covariance matrix is not decreasing with respect to the semidefinite inequality. While this result is of broad interest, we showcase an application of it by proving an improved Bayesian regret bound for the well-known Thompson sampling algorithm in stochastic linear bandits with changing action sets, where the prior and noise distributions are general. This bound is minimax optimal up to constants.
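For readers unfamiliar with the lemma, the following is a minimal sketch of the classical Gaussian setting that the abstract takes as its starting point. It is not the paper's general lemma or the authors' code. It runs Thompson sampling with a Gaussian prior and Gaussian noise on randomly generated, changing action sets, and checks the deterministic form of the elliptical potential lemma commonly stated in the linear bandit literature: with V_0 = λI, V_t = V_{t-1} + x_t x_tᵀ and ‖x_t‖ ≤ 1, the sum of min(1, x_tᵀ V_{t-1}⁻¹ x_t) is at most 2 log(det V_n / det V_0), which is in turn at most 2d log(1 + n/(dλ)). The function name `run_linear_ts`, its parameters, and the way action sets are generated are all assumptions made for illustration.

```python
import numpy as np


def run_linear_ts(d=5, k=10, n=500, lam=1.0, noise_sd=1.0, seed=0):
    """Gaussian linear Thompson sampling with a fresh action set each round.

    Hypothetical helper: returns the elliptical potential sum together with
    the log-determinant bound and its closed-form relaxation.
    """
    rng = np.random.default_rng(seed)
    # True parameter drawn from the prior N(0, (noise_sd^2 / lam) * I).
    theta_star = rng.normal(scale=noise_sd / np.sqrt(lam), size=d)

    V = lam * np.eye(d)   # regularized Gram matrix V_t
    b = np.zeros(d)       # running sum of reward * action
    potential = 0.0

    for _ in range(n):
        # Changing action set: k random actions rescaled to norm at most 1.
        actions = rng.normal(size=(k, d))
        actions /= np.maximum(1.0, np.linalg.norm(actions, axis=1, keepdims=True))

        # Gaussian case only: the posterior is N(V^{-1} b, noise_sd^2 * V^{-1}).
        mean = np.linalg.solve(V, b)
        chol = np.linalg.cholesky(V)  # V = chol @ chol.T
        theta_tilde = mean + noise_sd * np.linalg.solve(chol.T, rng.normal(size=d))

        # Act greedily with respect to the posterior sample.
        x = actions[np.argmax(actions @ theta_tilde)]
        reward = x @ theta_star + noise_sd * rng.normal()

        # Elliptical potential term uses V_{t-1}, so accumulate before updating.
        potential += min(1.0, x @ np.linalg.solve(V, x))
        V += np.outer(x, x)
        b += reward * x

    logdet_bound = 2.0 * (np.linalg.slogdet(V)[1] - d * np.log(lam))
    closed_form = 2.0 * d * np.log(1.0 + n / (d * lam))
    return potential, logdet_bound, closed_form


potential, logdet_bound, closed_form = run_linear_ts()
print(f"potential sum   = {potential:.3f}")
print(f"2 log det ratio = {logdet_bound:.3f}")
print(f"2 d log(1+n/dλ) = {closed_form:.3f}")
```

In this sketch the matrix V_t is a deterministic function of the chosen actions and grows monotonically in the semidefinite order, which is exactly the structure the abstract says is lost when the prior and noise are not Gaussian.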

