分发的Thompson抽样 (Distributed Thompson Sampling) - 专知论文

会员服务 ·

0

可约的 · 样本 · 赌博机/老虎机 · Performer · 学成 ·

2020 年 12 月 3 日

Distributed Thompson Sampling

翻译：分发的Thompson抽样

Jing Dong,Tan Li,Shaolei Ren,Linqi Song

We study a cooperative multi-agent multi-armed bandits with M agents and K arms. The goal of the agents is to minimized the cumulative regret. We adapt a traditional Thompson Sampling algoirthm under the distributed setting. However, with agent's ability to communicate, we note that communication may further reduce the upper bound of the regret for a distributed Thompson Sampling approach. To further improve the performance of distributed Thompson Sampling, we propose a distributed Elimination based Thompson Sampling algorithm that allow the agents to learn collaboratively. We analyse the algorithm under Bernoulli reward and derived a problem dependent upper bound on the cumulative regret.

翻译：我们用M剂和K武器研究一个多剂多武装合作匪徒,目的是尽量减少累积的遗憾。我们根据分布式环境对传统的Thompson取样器进行修改。然而,由于代理人的通讯能力,我们注意到,通信可能会进一步减少分发Thompson取样法的遗憾上限。为了进一步改善分发的Thompson取样法的绩效,我们提议一个分布式的基于Exprise Thompson取样法,使代理人能够合作学习。我们分析了Bernoulli奖项下的算法,并得出了一个取决于累积遗憾的问题。

0

相关内容

可约的

MIT科学家Dimitri P. Bertsekas最新《强化学习与最优控制》2021ASU课程，(附书稿PDF&讲义)

MIT科学家Dimitri P. Bertsekas最新《强化学习与最优控制》2021ASU课程，(附书稿PDF&讲义)

专知会员服务

91+阅读 · 2021年1月17日

【ICML2020Tutorial】机器学习信号处理，100页ppt

【ICML2020Tutorial】机器学习信号处理，100页ppt

专知会员服务

113+阅读 · 2020年8月15日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

【CMU-Spring2020课程】离散微分几何15讲，Discrete Differential Geometry

【CMU-Spring2020课程】离散微分几何15讲，Discrete Differential Geometry

专知会员服务

56+阅读 · 2020年3月26日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

专知会员服务

21+阅读 · 2019年12月2日

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

专知会员服务

85+阅读 · 2019年10月29日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

280+阅读 · 2019年10月9日

Distribution is all you need：这里有12种做ML不可不知的分布

Distribution is all you need：这里有12种做ML不可不知的分布

机器之心

3+阅读 · 2019年9月21日

腊月廿八 | 强化学习-TRPO和PPO背后的数学

腊月廿八 | 强化学习-TRPO和PPO背后的数学

AI研习社

18+阅读 · 2019年2月2日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【论文推荐】最新八篇网络节点表示相关论文—可扩展嵌入、对抗自编码器、图划分、异构信息、显式矩阵分解、深度高斯、图、随机游走

【论文推荐】最新八篇网络节点表示相关论文—可扩展嵌入、对抗自编码器、图划分、异构信息、显式矩阵分解、深度高斯、图、随机游走

专知

14+阅读 · 2018年3月30日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Fully Asynchronous Policy Evaluation in Distributed Reinforcement Learning over Networks

Arxiv

0+阅读 · 2021年1月22日

The Complexity of the Distributed Constraint Satisfaction Problem

Arxiv

0+阅读 · 2021年1月22日

Linear Regression with Distributed Learning: A Generalization Error Perspective

Arxiv

0+阅读 · 2021年1月22日

Rank Reduction in Bimatrix Games

Arxiv

0+阅读 · 2021年1月21日

Distributed Graph Convolutional Networks

Arxiv

19+阅读 · 2020年7月13日

A Survey on Distributed Machine Learning

Arxiv

45+阅读 · 2019年12月20日

Information-Directed Exploration for Deep Reinforcement Learning

Information-Directed Exploration for Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年12月18日

GPU-Accelerated Robotic Simulation for Distributed Reinforcement Learning

GPU-Accelerated Robotic Simulation for Distributed Reinforcement Learning

Arxiv

4+阅读 · 2018年10月24日

Optimal Algorithms for Non-Smooth Distributed Optimization in Networks

Arxiv

7+阅读 · 2018年6月1日

Negative Binomial Matrix Factorization for Recommender Systems

Arxiv

8+阅读 · 2018年1月5日

VIP会员

文章信息

相关主题

赌博机/老虎机

相关VIP内容

MIT科学家Dimitri P. Bertsekas最新《强化学习与最优控制》2021ASU课程，(附书稿PDF&讲义)

MIT科学家Dimitri P. Bertsekas最新《强化学习与最优控制》2021ASU课程，(附书稿PDF&讲义)

专知会员服务

91+阅读 · 2021年1月17日

【ICML2020Tutorial】机器学习信号处理，100页ppt

【ICML2020Tutorial】机器学习信号处理，100页ppt

专知会员服务

113+阅读 · 2020年8月15日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

【CMU-Spring2020课程】离散微分几何15讲，Discrete Differential Geometry

【CMU-Spring2020课程】离散微分几何15讲，Discrete Differential Geometry

专知会员服务

56+阅读 · 2020年3月26日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

专知会员服务

21+阅读 · 2019年12月2日

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

专知会员服务

85+阅读 · 2019年10月29日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

280+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【NTU博士论文】深度神经网络的参数高效推理与训练

人工智能：实时战斗适应

【NeurIPS2025】MIDAS：一种基于错配的用于失衡多模态学习的数据增强策略

从感知到认知：多模态大语言模型中视觉-语言交互推理综述

相关资讯

Distribution is all you need：这里有12种做ML不可不知的分布

Distribution is all you need：这里有12种做ML不可不知的分布

机器之心

3+阅读 · 2019年9月21日

腊月廿八 | 强化学习-TRPO和PPO背后的数学

腊月廿八 | 强化学习-TRPO和PPO背后的数学

AI研习社

18+阅读 · 2019年2月2日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【论文推荐】最新八篇网络节点表示相关论文—可扩展嵌入、对抗自编码器、图划分、异构信息、显式矩阵分解、深度高斯、图、随机游走

【论文推荐】最新八篇网络节点表示相关论文—可扩展嵌入、对抗自编码器、图划分、异构信息、显式矩阵分解、深度高斯、图、随机游走

专知

14+阅读 · 2018年3月30日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Fully Asynchronous Policy Evaluation in Distributed Reinforcement Learning over Networks

Arxiv

0+阅读 · 2021年1月22日

The Complexity of the Distributed Constraint Satisfaction Problem

Arxiv

0+阅读 · 2021年1月22日

Linear Regression with Distributed Learning: A Generalization Error Perspective

Arxiv

0+阅读 · 2021年1月22日

Rank Reduction in Bimatrix Games

Arxiv

0+阅读 · 2021年1月21日

Distributed Graph Convolutional Networks

Arxiv

19+阅读 · 2020年7月13日

A Survey on Distributed Machine Learning

Arxiv

45+阅读 · 2019年12月20日

Information-Directed Exploration for Deep Reinforcement Learning

Information-Directed Exploration for Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年12月18日

GPU-Accelerated Robotic Simulation for Distributed Reinforcement Learning

GPU-Accelerated Robotic Simulation for Distributed Reinforcement Learning

Arxiv

4+阅读 · 2018年10月24日

Optimal Algorithms for Non-Smooth Distributed Optimization in Networks

Arxiv

7+阅读 · 2018年6月1日

Negative Binomial Matrix Factorization for Recommender Systems

Arxiv

8+阅读 · 2018年1月5日

微信扫码咨询专知VIP会员