政策优化与线性时时逻辑限制 (Policy Optimization with Linear Temporal Logic Constraints) - 专知论文

会员服务 ·

0

优化器 · 线性的 · 代价 · 约束 · ENJOY ·

2022 年 10 月 19 日

Policy Optimization with Linear Temporal Logic Constraints

翻译：政策优化与线性时时逻辑限制

Cameron Voloshin,Hoang M. Le,Swarat Chaudhuri,Yisong Yue

We study the problem of policy optimization (PO) with linear temporal logic (LTL) constraints. The language of LTL allows flexible description of tasks that may be unnatural to encode as a scalar cost function. We consider LTL-constrained PO as a systematic framework, decoupling task specification from policy selection, and as an alternative to the standard of cost shaping. With access to a generative model, we develop a model-based approach that enjoys a sample complexity analysis for guaranteeing both task satisfaction and cost optimality (through a reduction to a reachability problem). Empirically, our algorithm can achieve strong performance even in low-sample regimes.

翻译：我们用线性时间逻辑(LTL)限制来研究政策优化问题,LTL的语言允许灵活地描述可能非自然的任务,将任务编码为计算成本功能。我们认为受LTL限制的PO是一个系统框架,将任务规格与政策选择脱钩,并作为成本形成标准的一种替代。有了基因化模型,我们开发了一个基于模型的方法,通过抽样的复杂性分析,保证任务满意度和成本最佳性(通过降低到可达性问题 ) 。

0

相关内容

优化器

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

专知会员服务

74+阅读 · 2020年8月2日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

Poisson流形上的修正Hamilton方法

国家自然科学基金

0+阅读 · 2014年12月31日

可压缩湍流粒子输运的拉格朗日（Lagrangian）研究

国家自然科学基金

0+阅读 · 2013年12月31日

数控装备运行可靠性的多源时变耦合建模与实时评估研究

国家自然科学基金

1+阅读 · 2013年12月31日

半导体衬底上FeSe薄膜的外延生长及界面超导

国家自然科学基金

0+阅读 · 2013年12月31日

Ghrelin抑制胸腺脂肪细胞生成的分子调控网络研究

国家自然科学基金

0+阅读 · 2012年12月31日

隧道围岩建模与动态更新的不确定性方法

国家自然科学基金

0+阅读 · 2012年12月31日

焊接结构本征疲劳裂纹扩展速率研究

国家自然科学基金

0+阅读 · 2012年12月31日

Reality-based Interaction用户界面模型和评估方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于梯度Kriging方法的轮胎花纹形状优化

国家自然科学基金

0+阅读 · 2011年12月31日

Fast gap-filling of massive data by local-equilibrium conditional simulations on GPU

Arxiv

0+阅读 · 2022年12月2日

ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency

Arxiv

0+阅读 · 2022年12月2日

Pre-averaging fractional processes contaminated by noise, with an application to turbulence

Arxiv

0+阅读 · 2022年12月1日

Game Implementation: What Are the Obstructions?

Arxiv

0+阅读 · 2022年12月1日

Learning Agile Paths from Optimal Control

Arxiv

0+阅读 · 2022年11月30日

Temporal Logic Imitation: Learning Plan-Satisficing Motion Policies from Demonstrations

Arxiv

0+阅读 · 2022年11月30日

Monitoring Arithmetic Temporal Properties on Finite Traces

Arxiv

0+阅读 · 2022年11月30日

Policy Optimization over General State and Action Spaces

Arxiv

0+阅读 · 2022年11月30日

Efficient Reinforcement Learning (ERL): Targeted Exploration Through Action Saturation

Arxiv

0+阅读 · 2022年11月30日

The Causal Learning of Retail Delinquency

Arxiv

14+阅读 · 2020年12月17日

VIP会员

文章信息

相关主题

相关VIP内容

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

专知会员服务

74+阅读 · 2020年8月2日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

最新，DeepSeek-R1论文登上Nature封面，附83页补充材料

人工智能与未来战争

自动驾驶中的轨迹预测大型基础模型：全面综述

万字长文《对抗雷达系统的电子战综述》

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Fast gap-filling of massive data by local-equilibrium conditional simulations on GPU

Arxiv

0+阅读 · 2022年12月2日

ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency

Arxiv

0+阅读 · 2022年12月2日

Pre-averaging fractional processes contaminated by noise, with an application to turbulence

Arxiv

0+阅读 · 2022年12月1日

Game Implementation: What Are the Obstructions?

Arxiv

0+阅读 · 2022年12月1日

Learning Agile Paths from Optimal Control

Arxiv

0+阅读 · 2022年11月30日

Temporal Logic Imitation: Learning Plan-Satisficing Motion Policies from Demonstrations

Arxiv

0+阅读 · 2022年11月30日

Monitoring Arithmetic Temporal Properties on Finite Traces

Arxiv

0+阅读 · 2022年11月30日

Policy Optimization over General State and Action Spaces

Arxiv

0+阅读 · 2022年11月30日

Efficient Reinforcement Learning (ERL): Targeted Exploration Through Action Saturation

Arxiv

0+阅读 · 2022年11月30日

The Causal Learning of Retail Delinquency

Arxiv

14+阅读 · 2020年12月17日

相关基金

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

Poisson流形上的修正Hamilton方法

国家自然科学基金

0+阅读 · 2014年12月31日

可压缩湍流粒子输运的拉格朗日（Lagrangian）研究

国家自然科学基金

0+阅读 · 2013年12月31日

数控装备运行可靠性的多源时变耦合建模与实时评估研究

国家自然科学基金

1+阅读 · 2013年12月31日

半导体衬底上FeSe薄膜的外延生长及界面超导

国家自然科学基金

0+阅读 · 2013年12月31日

Ghrelin抑制胸腺脂肪细胞生成的分子调控网络研究

国家自然科学基金

0+阅读 · 2012年12月31日

隧道围岩建模与动态更新的不确定性方法

国家自然科学基金

0+阅读 · 2012年12月31日

焊接结构本征疲劳裂纹扩展速率研究

国家自然科学基金

0+阅读 · 2012年12月31日

Reality-based Interaction用户界面模型和评估方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于梯度Kriging方法的轮胎花纹形状优化

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员