CAR-DESPOT: Causally-Informed Online POMDP Planning for Robots in Confounded Environments

Knowledge-driven · Causal models · Online · Robots · Partially observable Markov decision process

13 April 2023

Ricardo Cannizzaro, Lars Kunze

From arXiv; 8 pages, 3 figures; submitted to the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

Robots operating in real-world environments must reason about possible outcomes of stochastic actions and make decisions based on partial observations of the true world state. A major challenge for making accurate and robust action predictions is the problem of confounding, which, if left untreated, can lead to prediction errors. The partially observable Markov decision process (POMDP) is a widely used framework to model these stochastic and partially observable decision-making problems. However, due to a lack of explicit causal semantics, POMDP planning methods are prone to confounding bias and thus, in the presence of unobserved confounders, may produce underperforming policies. This paper presents a novel causally-informed extension of "anytime regularized determinized sparse partially observable tree" (AR-DESPOT), a modern anytime online POMDP planner, using causal modelling and inference to eliminate errors caused by unmeasured confounder variables. We further propose a method to learn offline the partial parameterisation of the causal model for planning, from ground truth model data. We evaluate our methods on a toy problem with an unobserved confounder and show that the learned causal model is highly accurate, while our planning method is more robust to confounding and produces higher-performing policies overall than AR-DESPOT.
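To make the notion of confounding bias concrete, the following is a minimal sketch in Python. It is not the CAR-DESPOT algorithm and does not use the paper's toy problem: the binary variables U (confounder), A (action) and Y (outcome), all probability values, and the function names are hypothetical, chosen only for illustration. The sketch compares the observational estimate P(Y=1 | A=a), which is what a model fitted to logged data would yield when U is hidden, with the interventional quantity P(Y=1 | do(A=a)), computed here by adjusting over U under the assumption that the full causal model is known.

```python
# Hypothetical toy model (illustrative values, not taken from the paper).
# U: hidden binary confounder, A: binary action, Y: binary outcome (1 = success).

# Prior over the confounder.
P_U = {0: 0.5, 1: 0.5}

# Behaviour that generated the logged data: the action choice depends on U.
# Indexed as P_A_GIVEN_U[u][a].
P_A_GIVEN_U = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.2, 1: 0.8}}

# Success probability P(Y=1 | A=a, U=u), indexed by (a, u).
P_Y1_GIVEN_A_U = {(0, 0): 0.5, (0, 1): 0.7, (1, 0): 0.4, (1, 1): 0.6}


def observational(a):
    """P(Y=1 | A=a): what logged data suggests when U is not recorded."""
    num = sum(P_U[u] * P_A_GIVEN_U[u][a] * P_Y1_GIVEN_A_U[(a, u)] for u in P_U)
    den = sum(P_U[u] * P_A_GIVEN_U[u][a] for u in P_U)
    return num / den


def interventional(a):
    """P(Y=1 | do(A=a)) = sum_u P(u) * P(Y=1 | a, u), adjusting over U."""
    return sum(P_U[u] * P_Y1_GIVEN_A_U[(a, u)] for u in P_U)


def best_action(estimator):
    """Return the action with the highest estimated success probability."""
    return max((0, 1), key=estimator)


if __name__ == "__main__":
    for a in (0, 1):
        print(a, round(observational(a), 3), round(interventional(a), 3))
    print("observational choice:", best_action(observational))
    print("interventional choice:", best_action(interventional))
```

With these assumed numbers, the observational estimate favours action 1 while the interventional estimate favours action 0, so a planner that treats the observational estimate as its action-outcome model would pick the worse action. This illustrates, in a deliberately simplified setting, the kind of error the abstract attributes to unobserved confounders.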
