Exploring $k$ out of Top $\rho$ Fraction of Arms in Stochastic Bandits - Zhuanzhi Papers


Multi-armed bandits · PAC learning theory · Probably approximately correct · Optimizer · Identifiable

November 19, 2020

Exploring $k$ out of Top $\rho$ Fraction of Arms in Stochastic Bandits


Wenbo Ren, Jia Liu, Ness Shroff

This paper studies the problem of identifying any $k$ distinct arms among the top $\rho$ fraction (e.g., top 5%) of arms from a finite or infinite set, with a probably approximately correct (PAC) tolerance $\epsilon$. We consider two cases: (i) when the threshold of the top arms' expected rewards is known and (ii) when it is unknown. We prove lower bounds for the four variants (finite or infinite arms, and known or unknown threshold), and propose an algorithm for each. Two of these algorithms are shown to be sample-complexity optimal (up to constant factors), and the other two are optimal up to a log factor. Results in this paper provide up to $\rho n/k$ reductions in sample complexity compared with the "$k$-exploration" algorithms that focus on finding the (PAC) best $k$ arms out of $n$ arms. We also numerically show improvements over the state of the art.
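The abstract does not spell out the formal success criterion or the algorithms themselves, so the following is only a minimal sketch under stated assumptions. It assumes a finite set of arms with Bernoulli rewards, takes the "threshold" to be the mean of the $\lceil \rho n \rceil$-th best arm, and counts an output as correct if it contains $k$ distinct arms whose means are each within $\epsilon$ of that threshold. The function names (`top_fraction_threshold`, `is_pac_correct`, `naive_explore`) are hypothetical, and `naive_explore` is a deliberately naive uniform-sampling baseline, not any of the algorithms proposed in the paper.

```python
import math
import random


def top_fraction_threshold(means, rho):
    """Mean of the ceil(rho * n)-th best arm (assumed reading of the 'threshold')."""
    n = len(means)
    # Small guard so that floating-point noise in rho * n does not bump the ceiling.
    m = max(1, math.ceil(rho * n - 1e-9))
    return sorted(means, reverse=True)[m - 1]


def is_pac_correct(means, chosen, k, rho, eps):
    """True if `chosen` has k distinct arms, each within eps of the top-rho threshold."""
    cutoff = top_fraction_threshold(means, rho)
    return (
        len(chosen) == k
        and len(set(chosen)) == k
        and all(means[i] >= cutoff - eps for i in chosen)
    )


def naive_explore(means, k, num_candidates, pulls_per_arm, rng):
    """Naive baseline (NOT the paper's method): subsample arms uniformly,
    pull each one a fixed number of times, keep the k best empirical means."""
    candidates = rng.sample(range(len(means)), num_candidates)

    def empirical_mean(i):
        # Simulated Bernoulli rewards with success probability means[i].
        wins = sum(rng.random() < means[i] for _ in range(pulls_per_arm))
        return wins / pulls_per_arm

    scores = {i: empirical_mean(i) for i in candidates}
    return sorted(candidates, key=lambda i: scores[i], reverse=True)[:k]


if __name__ == "__main__":
    rng = random.Random(0)
    means = [rng.random() for _ in range(1000)]  # hypothetical arm means
    chosen = naive_explore(means, k=5, num_candidates=200, pulls_per_arm=400, rng=rng)
    print(chosen, is_pac_correct(means, chosen, k=5, rho=0.05, eps=0.1))
```

The baseline spends `num_candidates * pulls_per_arm` samples regardless of how easy the instance is. The point of the sketch is only to make the problem statement concrete: the output needs $k$ arms anywhere inside the top $\rho$ fraction (up to $\epsilon$), not the $k$ best arms overall, which is the relaxation the abstract credits for the up to $\rho n/k$ savings.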

