PAC 组合理论勘探快速算法 (A Fast Algorithm for PAC Combinatorial Pure Exploration) - 专知论文

会员服务 ·

0

PAC学习理论 · 样本复杂度 · FAST · 估计/估计量 · CASES ·

2021 年 12 月 8 日

A Fast Algorithm for PAC Combinatorial Pure Exploration

翻译：PAC 组合理论勘探快速算法

Noa Ben-David,Sivan Sabato

from arxiv, Full version of paper accepted to AAAI-22

We consider the problem of Combinatorial Pure Exploration (CPE), which deals with finding a combinatorial set or arms with a high reward, when the rewards of individual arms are unknown in advance and must be estimated using arm pulls. Previous algorithms for this problem, while obtaining sample complexity reductions in many cases, are highly computationally intensive, thus making them impractical even for mildly large problems. In this work, we propose a new CPE algorithm in the PAC setting, which is computationally light weight, and so can easily be applied to problems with tens of thousands of arms. This is achieved since the proposed algorithm requires a very small number of combinatorial oracle calls. The algorithm is based on successive acceptance of arms, along with elimination which is based on the combinatorial structure of the problem. We provide sample complexity guarantees for our algorithm, and demonstrate in experiments its usefulness on large problems, whereas previous algorithms are impractical to run on problems of even a few dozen arms. The code for the algorithms and experiments is provided at https://github.com/noabdavid/csale.

翻译：我们考虑的是混合纯勘探(CPE)问题,它涉及的是寻找一组武器或具有高报酬的武器,而个别武器的奖励事先不为人知,必须用手臂拉来估计。以前对这一问题的算法虽然在许多情况下获得样品复杂性的减少,但具有很高的计算密集性,因此即使轻微的较大问题也使它们不切实际。在这项工作中,我们提议在PAC设置中采用一种新的CPE算法,即计算性轻重,因此可以很容易地适用于数万件武器的问题。这是由于提议的算法需要非常少的组合手动电话而实现的。这种算法的基础是连续接受武器,同时根据问题的组合结构来消除武器。我们为我们的算法提供抽样复杂性保证,并在实验中证明它对大问题的用处,而以前的算法甚至对几十件武器的问题都不切实际操作。计算法和实验的代码在https://github.com/noabdavid/cselle上提供。

0

相关内容

PAC学习理论

PAC学习理论

PAC学习理论不关心假设选择算法，他关心的是能否从假设空间H中学习一个好的假设h。此理论不关心怎样在假设空间中寻找好的假设，只关心能不能找得到。现在我们在来看一下什么叫“好假设”？只要满足两个条件(PAC辨识条件)即可

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

【经典书】线性代数，436页pdf

专知会员服务

78+阅读 · 2021年3月16日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

252+阅读 · 2020年4月19日

【硬核书】金融数学C++编程，411页pdf，C++ for Financial Mathematics

【硬核书】金融数学C++编程，411页pdf，C++ for Financial Mathematics

专知会员服务

75+阅读 · 2020年4月6日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【经典书】C++解决问题第七版，1074pdf，Problem Solving with C++

【经典书】C++解决问题第七版，1074pdf，Problem Solving with C++

专知会员服务

77+阅读 · 2020年2月20日

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

10+阅读 · 2019年10月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【新书】Python编程基础，669页pdf

【新书】Python编程基础，669页pdf

专知会员服务

197+阅读 · 2019年10月10日

LibRec 精选：AutoML for Contextual Bandits

LibRec 精选：AutoML for Contextual Bandits

LibRec智能推荐

7+阅读 · 2019年9月19日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

【OpenAI】深度强化学习关键论文列表

【OpenAI】深度强化学习关键论文列表

专知

11+阅读 · 2018年11月10日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

Sim-to-Lab-to-Real: Safe Reinforcement Learning with Shielding and Generalization Guarantees

Arxiv

0+阅读 · 2022年2月10日

An Exact Method for Fortification Games

Arxiv

0+阅读 · 2022年2月9日

SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization

Arxiv

0+阅读 · 2022年2月8日

Optimal Algorithm Allocation for Single Robot Cloud Systems

Arxiv

0+阅读 · 2022年2月8日

Minimal Variance Sampling with Provable Guarantees for Fast Training of Graph Neural Networks

Minimal Variance Sampling with Provable Guarantees for Fast Training of Graph Neural Networks

Arxiv

13+阅读 · 2020年6月24日

Reinforcement Learning Enhanced Quantum-inspired Algorithm for Combinatorial Optimization

Arxiv

4+阅读 · 2020年2月14日

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

Arxiv

8+阅读 · 2019年5月20日

Learning Discrete Structures for Graph Neural Networks

Arxiv

17+阅读 · 2019年3月28日

GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms

GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms

Arxiv

4+阅读 · 2018年8月17日

Reinforcement Learning for Solving the Vehicle Routing Problem

Arxiv

3+阅读 · 2018年5月21日

VIP会员

文章信息

相关主题

PAC学习理论

样本复杂度

估计/估计量

相关VIP内容

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

【经典书】线性代数，436页pdf

专知会员服务

78+阅读 · 2021年3月16日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

252+阅读 · 2020年4月19日

【硬核书】金融数学C++编程，411页pdf，C++ for Financial Mathematics

【硬核书】金融数学C++编程，411页pdf，C++ for Financial Mathematics

专知会员服务

75+阅读 · 2020年4月6日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【经典书】C++解决问题第七版，1074pdf，Problem Solving with C++

【经典书】C++解决问题第七版，1074pdf，Problem Solving with C++

专知会员服务

77+阅读 · 2020年2月20日

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

10+阅读 · 2019年10月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【新书】Python编程基础，669页pdf

【新书】Python编程基础，669页pdf

专知会员服务

197+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

【NTU博士论文】利用强化学习与生成模型推进可靠且可泛化的决策

美海军研发“增强侦察与态势评估系统（ARES）”应用程序以优化作战规划（附研究论文）

【NeurIPS2025】DNA-DetectLLM：基于 DNA 启发的“突变-修复”范式揭示 AI 生成文本

面向深度研究系统的强化学习基础：综述

相关资讯

LibRec 精选：AutoML for Contextual Bandits

LibRec 精选：AutoML for Contextual Bandits

LibRec智能推荐

7+阅读 · 2019年9月19日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

【OpenAI】深度强化学习关键论文列表

【OpenAI】深度强化学习关键论文列表

专知

11+阅读 · 2018年11月10日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

Sim-to-Lab-to-Real: Safe Reinforcement Learning with Shielding and Generalization Guarantees

Arxiv

0+阅读 · 2022年2月10日

An Exact Method for Fortification Games

Arxiv

0+阅读 · 2022年2月9日

SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization

Arxiv

0+阅读 · 2022年2月8日

Optimal Algorithm Allocation for Single Robot Cloud Systems

Arxiv

0+阅读 · 2022年2月8日

Minimal Variance Sampling with Provable Guarantees for Fast Training of Graph Neural Networks

Minimal Variance Sampling with Provable Guarantees for Fast Training of Graph Neural Networks

Arxiv

13+阅读 · 2020年6月24日

Reinforcement Learning Enhanced Quantum-inspired Algorithm for Combinatorial Optimization

Arxiv

4+阅读 · 2020年2月14日

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

Arxiv

8+阅读 · 2019年5月20日

Learning Discrete Structures for Graph Neural Networks

Arxiv

17+阅读 · 2019年3月28日

GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms

GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms

Arxiv

4+阅读 · 2018年8月17日

Reinforcement Learning for Solving the Vehicle Routing Problem

Arxiv

3+阅读 · 2018年5月21日

微信扫码咨询专知VIP会员