Non-Stationary Bandit Learning via Predictive Sampling - 专知论文


March 13, 2023

Non-Stationary Bandit Learning via Predictive Sampling


Yueyang Liu, Benjamin Van Roy, Kuang Xu

Thompson sampling has proven effective across a wide range of stationary bandit environments. However, as we demonstrate in this paper, it can perform poorly when applied to non-stationary environments. We show that such failures arise because, when exploring, the algorithm does not differentiate actions based on how quickly the information acquired loses its usefulness due to non-stationarity. Building upon this insight, we propose predictive sampling, an algorithm that deprioritizes acquiring information that quickly loses usefulness. A theoretical guarantee on the performance of predictive sampling is established through a Bayesian regret bound. We provide versions of predictive sampling for which computations tractably scale to complex bandit environments of practical interest. Through numerical simulations, we demonstrate that predictive sampling outperforms Thompson sampling in all non-stationary environments examined.
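For readers unfamiliar with the baseline, the following is a minimal sketch of standard Thompson sampling on a stationary Bernoulli bandit with Beta(1, 1) priors. It is not the paper's predictive sampling algorithm, which the abstract does not describe in enough detail to reproduce. The function name `thompson_sampling`, its parameters, and the example reward probabilities are assumptions chosen for illustration.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Run Beta-Bernoulli Thompson sampling on a stationary bandit.

    true_means: hypothetical success probability of each arm (fixed over time).
    horizon: number of rounds to play.
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    # Beta(1, 1) prior on each arm's mean reward (an assumption of this sketch).
    alpha = [1] * n_arms
    beta = [1] * n_arms
    pulls = [0] * n_arms
    for _ in range(horizon):
        # Draw one plausible mean per arm from its current posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled means.
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Observe a Bernoulli reward and update that arm's posterior.
        reward = 1 if rng.random() < true_means[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    print(thompson_sampling([0.2, 0.8], horizon=2000))
```

Note that the posterior update treats every past observation as equally informative. This is reasonable when the arm means are fixed, and it is the setting in which the abstract says Thompson sampling works well. The sketch does not model non-stationarity.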

