May 19, 2023

Implicitly normalized forecaster with clipping for linear and non-linear heavy-tailed multi-armed bandits

Yuriy Dorn,Nikita Kornilov,Nikolay Kutuzov,Alexander Nazin,Eduard Gorbunov,Alexander Gasnikov

The Implicitly Normalized Forecaster (INF) algorithm is considered to be an optimal solution for adversarial multi-armed bandit (MAB) problems. However, most existing complexity results for INF rely on restrictive assumptions, such as bounded rewards. Recently, a related algorithm was proposed that works for both adversarial and stochastic heavy-tailed MAB settings, but it fails to fully exploit the available data. In this paper, we propose a new version of INF, the Implicitly Normalized Forecaster with clipping (INF-clip), for MAB problems with heavy-tailed reward distributions. We establish convergence results under mild assumptions on the reward distribution and demonstrate that INF-clip is optimal for linear heavy-tailed stochastic MAB problems and works well for non-linear ones. Furthermore, we show that INF-clip outperforms the best-of-both-worlds algorithm in cases where it is difficult to distinguish between different arms.
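The abstract does not spell out how the clipping is applied, so the following is only a minimal sketch of the general idea, not the paper's INF-clip procedure. It assumes a standard importance-weighted loss estimate in which the observed loss is clipped to a threshold before being divided by the probability of the chosen arm. The names `clip`, `clipped_loss_estimate` and `threshold` are hypothetical and chosen for illustration.

```python
def clip(value, threshold):
    """Limit the magnitude of value to threshold, keeping its sign."""
    if abs(value) <= threshold:
        return value
    return threshold if value > 0 else -threshold


def clipped_loss_estimate(probabilities, chosen_arm, observed_loss, threshold):
    """Importance-weighted loss vector built from a clipped observation.

    Assumption (not from the abstract): only the chosen arm gets a
    non-zero entry, equal to the clipped loss divided by the probability
    with which that arm was played.
    """
    estimate = [0.0] * len(probabilities)
    estimate[chosen_arm] = clip(observed_loss, threshold) / probabilities[chosen_arm]
    return estimate


# A heavy-tailed outlier (loss 10.0) is capped at the threshold 2.0
# before importance weighting, so one extreme sample cannot dominate.
example = clipped_loss_estimate([0.5, 0.5], chosen_arm=1, observed_loss=10.0, threshold=2.0)
```

Clipping trades a small bias for a bounded estimate, which is the usual motivation for using it when rewards may have heavy tails.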
