Nonstationary Bandit Learning via Predictive Sampling - Zhuanzhi Papers


May 4, 2022

Nonstationary Bandit Learning via Predictive Sampling


Yueyang Liu, Benjamin Van Roy, Kuang Xu

We propose predictive sampling as an approach to selecting actions that balance between exploration and exploitation in nonstationary bandit environments. When specialized to stationary environments, predictive sampling is equivalent to Thompson sampling. However, predictive sampling is effective across a range of nonstationary environments in which Thompson sampling suffers. We establish a general information-theoretic bound on the Bayesian regret of predictive sampling. We then specialize this bound to study a modulated Bernoulli bandit environment. Our analysis highlights a key advantage of predictive sampling over Thompson sampling: predictive sampling deprioritizes investments in exploration where acquired information will quickly become less relevant.
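For orientation, the stationary baseline that the abstract compares against can be sketched in a few lines. The following is a minimal illustration of standard Thompson sampling on a stationary Bernoulli bandit with independent Beta(1, 1) priors; it is not the authors' predictive sampling algorithm, whose details are not given here. The function name `thompson_sampling`, the arm means, the horizon, and the seed are all assumptions chosen for illustration.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Standard Thompson sampling for a stationary Bernoulli bandit.

    Illustrative sketch only: true_means, horizon, and seed are
    hypothetical inputs, not values taken from the paper.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    # Beta(1, 1) prior on each arm's success probability.
    alpha = [1.0] * n_arms
    beta = [1.0] * n_arms
    pulls = [0] * n_arms
    total_reward = 0
    for _ in range(horizon):
        # Draw one posterior sample per arm and act greedily on the samples.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Observe a Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_means[arm] else 0
        # Conjugate Beta-Bernoulli posterior update.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward
    return pulls, total_reward


pulls, total_reward = thompson_sampling([0.2, 0.8], horizon=2000)
print(pulls, total_reward)
```

Because the posterior update above treats every past observation as equally informative, this baseline has no mechanism for discounting information that goes stale, which is the situation the abstract says predictive sampling is designed to handle.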

