Bounded Memory Adversarial Bandits with Composite Anonymous Delayed Feedback - 专知论文


April 27, 2022

Bounded Memory Adversarial Bandits with Composite Anonymous Delayed Feedback


Zongqi Wan, Xiaoming Sun, Jialin Zhang

From arXiv, IJCAI 2022

We study the adversarial bandit problem with composite anonymous delayed feedback. In this setting, the loss of an action is split into $d$ components that are spread over the consecutive rounds after the action is chosen, and in each round the algorithm observes only the aggregate of the loss components arriving from the latest $d$ rounds. Previous works focus on the oblivious adversarial setting, whereas we investigate the harder non-oblivious setting. We show that the non-oblivious setting incurs $\Omega(T)$ pseudo regret even when the loss sequence has bounded memory. However, we propose a wrapper algorithm that enjoys $o(T)$ policy regret on many adversarial bandit problems under the assumption that the loss sequence has bounded memory. In particular, for the $K$-armed bandit and bandit convex optimization, we obtain an $\mathcal{O}(T^{2/3})$ policy regret bound. We also prove a matching lower bound for the $K$-armed bandit. Our lower bound holds even when the loss sequence is oblivious but the delay is non-oblivious. This answers the open problem posed in \cite{wang2021adaptive}, showing that non-oblivious delay is enough to incur $\tilde{\Omega}(T^{2/3})$ regret.
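The feedback model described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' algorithm or code. It only shows how per-action losses, once split into $d$ components, blend into a single anonymous number per round. The names `observe` and `even_split` and the concrete loss values are hypothetical, and the split here is a fixed function of the arm, whereas the paper's non-oblivious setting lets the adversary react to past actions.

```python
def observe(actions, split_loss, d):
    """Return the aggregated loss the learner sees in each round.

    actions[t] is the arm played in round t. split_loss(t, arm) returns
    d components; component s is charged in round t + s, so the feedback
    in a round mixes contributions from the latest d rounds.
    """
    horizon = len(actions)
    observed = [0.0] * horizon
    for t, arm in enumerate(actions):
        parts = split_loss(t, arm)
        assert len(parts) == d, "each loss must be split into d components"
        for s, part in enumerate(parts):
            if t + s < horizon:  # components past the horizon are never seen
                observed[t + s] += part
    return observed


def even_split(t, arm):
    """Hypothetical adversary: arm 0 costs 0.75, arm 1 costs 0, spread evenly over 3 rounds."""
    total = 0.75 if arm == 0 else 0.0
    return [total / 3] * 3


feedback = observe([0, 1, 1, 0], even_split, d=3)
print(feedback)
```

In this toy run the learner sees the same value in every round, although it played the costly arm only in rounds 0 and 3. The feedback in rounds 1 and 2 is entirely due to the action taken in round 0, which is what makes the feedback both delayed and anonymous.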

