Model Selection in Contextual Stochastic Bandit Problems - Zhuanzhi (专知) Papers

Bandits · Model selection · Base · MoDELS · Optimizers

December 4, 2022

Model Selection in Contextual Stochastic Bandit Problems

Aldo Pacchiano, My Phan, Yasin Abbasi-Yadkori, Anup Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvari

from arxiv, 33 main pages, 15 appendix pages

We study bandit model selection in stochastic environments. Our approach relies on a meta-algorithm that selects between candidate base algorithms. We develop a meta-algorithm/base-algorithm abstraction that can work with general classes of base algorithms and different types of adversarial meta-algorithms. Our methods rely on a novel and generic smoothing transformation for bandit algorithms that permits us to obtain optimal $O(\sqrt{T})$ model selection guarantees for stochastic contextual bandit problems as long as the optimal base algorithm satisfies a high-probability regret guarantee. We show through a lower bound that even when one of the base algorithms has $O(\log T)$ regret, in general it is impossible to get better than $\Omega(\sqrt{T})$ regret in model selection, even asymptotically. Using our techniques, we address model selection in a variety of problems such as misspecified linear contextual bandits, linear bandits with unknown dimension, and reinforcement learning with unknown feature maps. Our algorithm requires knowledge of the optimal base regret to adjust the meta-algorithm learning rate. We show that without such prior knowledge any meta-algorithm can suffer a regret larger than the optimal base regret.
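To make the meta-algorithm/base-algorithm abstraction in the abstract concrete, the following is a minimal sketch, not the authors' method. It assumes a plain EXP3-style adversarial meta-algorithm choosing among base algorithms on a Bernoulli multi-armed bandit, and it omits the smoothing transformation the abstract refers to. The names `EpsilonGreedy`, `exp3_probs`, and `run_meta`, the fixed parameter `gamma`, and the arm means are all hypothetical choices made for illustration.

```python
import math
import random


class EpsilonGreedy:
    """Hypothetical base algorithm: epsilon-greedy over k arms."""

    def __init__(self, k, epsilon, rng):
        self.k, self.epsilon, self.rng = k, epsilon, rng
        self.counts = [0] * k
        self.means = [0.0] * k

    def select(self):
        # Explore uniformly with probability epsilon, else play the best arm so far.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.k)
        return max(range(self.k), key=lambda a: self.means[a])

    def update(self, arm, reward):
        # Incremental update of the empirical mean reward of the played arm.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def exp3_probs(weights, gamma):
    """EXP3 sampling distribution: exponential weights mixed with uniform."""
    total = sum(weights)
    m = len(weights)
    return [(1 - gamma) * w / total + gamma / m for w in weights]


def run_meta(bases, arm_means, horizon, gamma, rng):
    """Each round the meta-algorithm samples one base, which plays an arm.

    Only the sampled base observes the reward and updates its state.
    Assumption: gamma is fixed in advance, not tuned to a base regret bound.
    """
    m = len(bases)
    weights = [1.0] * m
    plays = [0] * m
    total_reward = 0.0
    for _ in range(horizon):
        probs = exp3_probs(weights, gamma)
        i = rng.choices(range(m), weights=probs)[0]
        arm = bases[i].select()
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        bases[i].update(arm, reward)
        # Importance-weighted reward estimate for the sampled base.
        weights[i] *= math.exp(gamma * (reward / probs[i]) / m)
        # Rescale so the largest weight is 1, which avoids overflow.
        top = max(weights)
        weights = [w / top for w in weights]
        plays[i] += 1
        total_reward += reward
    return plays, total_reward


if __name__ == "__main__":
    rng = random.Random(0)
    bases = [EpsilonGreedy(3, 0.1, rng), EpsilonGreedy(3, 1.0, rng)]
    plays, total = run_meta(bases, [0.2, 0.5, 0.8], 2000, 0.1, rng)
    print("rounds given to each base:", plays, "total reward:", total)
```

In this sketch the second base explores uniformly forever, so a reasonable meta-algorithm should tend to favor the first. A base that is played only occasionally also learns more slowly than it would alone, which is one reason a naive combination like this does not by itself give the guarantees described in the abstract.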
