Numerical method to solve impulse control problems for partially observed piecewise deterministic Markov processes - Zhuanzhi Papers


Piecewise · Markov process · Impulse · Impulse control · Control problem

Numerical method to solve impulse control problems for partially observed piecewise deterministic Markov processes


Alice Cleynen, Benoîte de Saporta

Designing efficient and rigorous numerical methods for sequential decision-making under uncertainty is a difficult problem that arises in many application frameworks. In this paper we focus on the numerical solution of a subclass of impulse control problems for piecewise deterministic Markov processes (PDMPs) when the jump times are hidden. We first state the problem as a partially observed Markov decision process (POMDP) on a continuous state space, with controlled transition kernels corresponding to some specific skeleton chains of the PDMP. We then build a numerically tractable approximation of the POMDP by tailor-made discretizations of the state spaces. The main difficulty in evaluating the discretization error comes from the possible random jumps of the PDMP between consecutive epochs of the POMDP, which require special care. Finally, we discuss the practical construction of discretization grids and illustrate our method on simulations.
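To make the idea of a discretized POMDP more concrete, here is a minimal sketch of a generic Bayes filter step on a finite grid: once the state space has been replaced by finitely many grid points, the belief over the hidden state is a probability vector that is pushed through the transition matrix of the chosen action and then reweighted by the likelihood of the new observation. This is a textbook construction and not the authors' algorithm; the function name `belief_update`, the two-point grid, and all numerical values are assumptions made only for illustration.

```python
from typing import List


def belief_update(belief: List[float],
                  kernel: List[List[float]],
                  obs_likelihood: List[float]) -> List[float]:
    """One Bayes filter step on a finite grid (hypothetical helper).

    belief[i]         : current probability of grid point i
    kernel[i][j]      : probability of moving from grid point i to j
                        under the chosen action (rows sum to 1)
    obs_likelihood[j] : likelihood of the new observation at grid point j
    """
    n = len(belief)
    # Prediction: propagate the belief through the controlled transition matrix.
    predicted = [sum(belief[i] * kernel[i][j] for i in range(n))
                 for j in range(n)]
    # Correction: reweight each grid point by the observation likelihood.
    unnormalized = [predicted[j] * obs_likelihood[j] for j in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the predicted belief")
    # Normalize so the updated belief is again a probability vector.
    return [w / total for w in unnormalized]


# Illustrative two-point grid; all numbers are made up for the example.
prior = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.2, 0.8]]
likelihood = [0.7, 0.1]
posterior = belief_update(prior, transition, likelihood)
print(posterior)
```

In this toy setting the observation favors the first grid point, so the updated belief concentrates there. In the paper's setting the transition kernels come from skeleton chains of the PDMP, and the abstract stresses that controlling the error introduced by the grid is the delicate part, which this sketch does not address.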

