Chronological Causal Bandits - Zhuanzhi Papers


December 3, 2021

Chronological Causal Bandits


From arXiv, 10 pages; accepted at the NeurIPS 2021 workshop "Causal Inference Challenges in Sequential Decision Making: Bridging Theory and Practice".

This paper studies an instance of the multi-armed bandit (MAB) problem in which several causal MABs operate chronologically in the same dynamical system. In practice, the reward distribution of each bandit is governed by the same non-trivial dependence structure, a dynamic causal model. The model is dynamic because each causal MAB is allowed to depend on the preceding MAB, which makes it possible to transfer information between agents. Our contribution, the Chronological Causal Bandit (CCB), is useful in discrete decision-making settings where the causal effects change over time and can be informed by earlier interventions in the same system. In this paper, we present some early findings on the CCB, demonstrated on a toy problem.
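To make the chronological setting concrete, here is a minimal sketch of bandits chained in time. It is not the paper's CCB algorithm and uses no causal graph; the two-armed Bernoulli setup, the carry-over effect in `reward_prob`, the epsilon-greedy learner, and all function names are assumptions chosen purely for illustration. It only shows the one idea stated in the abstract: the reward distribution faced at one time step depends on the intervention implemented at the previous step.

```python
import random

def reward_prob(arm, prev_arm):
    """Hypothetical toy dynamics: success probability of `arm` at time t,
    given the arm implemented at time t-1 (None for the first bandit)."""
    base = {0: 0.3, 1: 0.5}[arm]
    # Assumed carry-over effect: implementing arm 1 earlier boosts arm 0 now.
    if prev_arm == 1 and arm == 0:
        base += 0.4
    return base

def run_bandit(prev_arm, n_rounds, rng, eps=0.1):
    """Epsilon-greedy on a two-armed Bernoulli bandit.
    Returns (empirically best arm, per-arm mean rewards)."""
    counts = [0, 0]
    sums = [0.0, 0.0]
    for _ in range(n_rounds):
        # Explore at random until both arms are tried, and with prob. eps after.
        if 0 in counts or rng.random() < eps:
            arm = rng.randrange(2)
        else:
            arm = max(range(2), key=lambda a: sums[a] / counts[a])
        reward = 1.0 if rng.random() < reward_prob(arm, prev_arm) else 0.0
        counts[arm] += 1
        sums[arm] += reward
    means = [sums[a] / counts[a] if counts[a] else 0.0 for a in range(2)]
    best = max(range(2), key=lambda a: means[a])
    return best, means

def run_chain(n_steps, n_rounds, seed=0):
    """Run one bandit per time step; each step's implemented arm
    becomes the context (prev_arm) of the next step's bandit."""
    rng = random.Random(seed)
    prev_arm = None
    chosen = []
    for _ in range(n_steps):
        best, _ = run_bandit(prev_arm, n_rounds, rng)
        chosen.append(best)
        prev_arm = best
    return chosen

if __name__ == "__main__":
    print(run_chain(n_steps=4, n_rounds=2000))
```

Under these assumed dynamics the best arm at a given step changes with what was implemented before it (arm 0 is only attractive after arm 1), which is why a learner that ignores earlier interventions would be modelling the wrong reward distribution.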

