SPRT-based Efficient Best Arm Identification in Stochastic Bandits - 专知 (Zhuanzhi) Papers



March 4, 2023

SPRT-based Efficient Best Arm Identification in Stochastic Bandits


Arpan Mukherjee, Ali Tajer

This paper investigates the best arm identification (BAI) problem in stochastic multi-armed bandits in the fixed confidence setting. The general class of the exponential family of bandits is considered. The state-of-the-art algorithms for the exponential family of bandits face computational challenges. To mitigate these challenges, a novel framework is proposed, which views the BAI problem as sequential hypothesis testing, and is amenable to tractable analysis for the exponential family of bandits. Based on this framework, a BAI algorithm is designed that leverages the canonical sequential probability ratio tests. This algorithm has three features for both settings: (1) its sample complexity is asymptotically optimal, (2) it is guaranteed to be $\delta$-PAC, and (3) it addresses the computational challenge of the state-of-the-art approaches. Specifically, these approaches, which are focused only on the Gaussian setting, require Thompson sampling from the arm that is deemed the best and a challenger arm. This paper analytically shows that identifying the challenger is computationally expensive and that the proposed algorithm circumvents it. Finally, numerical experiments are provided to support the analysis.
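For readers unfamiliar with the sequential probability ratio test that the abstract refers to, the following is a minimal sketch of Wald's classical SPRT between two simple hypotheses on a Gaussian mean. It is not the authors' BAI algorithm: the function name `sprt_gaussian`, the known-variance assumption, and the error levels `alpha` and `beta` are illustrative choices made here, and the stopping thresholds are Wald's standard approximations.

```python
import math


def sprt_gaussian(samples, mu0, mu1, sigma=1.0, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: mean = mu0 versus H1: mean = mu1 (known sigma).

    Illustrative only; all names and defaults are assumptions, not taken
    from the paper. Returns (decision, n_used), where decision is "H0",
    "H1", or "undecided" if the samples run out before a boundary is hit.
    """
    upper = math.log((1 - beta) / alpha)  # accept H1 at or above this
    lower = math.log(beta / (1 - alpha))  # accept H0 at or below this
    llr = 0.0  # cumulative log-likelihood ratio log p1/p0
    n = 0
    for n, x in enumerate(samples, start=1):
        # Per-sample increment of log N(x; mu1, sigma) / N(x; mu0, sigma).
        llr += (mu1 - mu0) * (x - 0.5 * (mu0 + mu1)) / sigma**2
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", n
```

The test stops as soon as the accumulated evidence crosses either threshold, which is what makes it sample-efficient. In a bandit setting one could imagine running such tests on reward observations to decide whether one arm's mean exceeds another's, but how the paper actually constructs its hypotheses and sampling rule is not stated in the abstract.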

