How To Not Drive: Learning Driving Constraints from Demonstration - Zhuanzhi Papers


Cost function · Learning · Constraints · Optimizer · Functional

October 1, 2021

How To Not Drive: Learning Driving Constraints from Demonstration


Kasra Rezaee, Peyman Yadmellat

From arXiv. Submitted to the IEEE International Conference on Robotics and Automation (ICRA) 2022.

We propose a new scheme to learn motion planning constraints from human driving trajectories. Behavioral planning and motion planning are key components of an autonomous driving system. The behavioral planner is responsible for the high-level decision making required to follow traffic rules and interact with other road participants. The motion planner's role is to generate feasible, safe trajectories for a self-driving vehicle to follow. The trajectories are generated by optimizing a cost function based on metrics related to smoothness, movability, and comfort, subject to a set of constraints derived from the planned behavior, safety considerations, and feasibility. A common practice is to design the cost function and constraints manually. Recent work has investigated learning the cost function from human driving demonstrations. While effective, the practical application of such approaches to autonomous driving is still questionable. In contrast, this paper focuses on learning driving constraints, which can be used as an add-on module to existing autonomous driving solutions. To learn the constraints, the planning problem is formulated as a constrained Markov Decision Process whose elements are assumed to be known except for the constraints. The constraints are then learned by learning the distribution of expert trajectories and estimating the probability that optimal trajectories belong to the learned distribution. The proposed scheme is evaluated using the NGSIM dataset, yielding a collision rate and out-of-road maneuvers of less than 1% when the learned constraints are used in an optimization-based motion planner.
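The idea of adding a learned constraint to an existing optimization-based planner can be illustrated with a toy sketch. This is not the authors' method: the 1-D Gaussian density, the 3-sigma threshold, the lateral-offset trajectories, and every function name below are assumptions made purely for illustration. The sketch fits a density to hypothetical expert data, treats low-density trajectories as constraint violations, and then minimizes a hand-designed smoothness cost over the remaining candidates.

```python
import math
from statistics import mean, pstdev

def fit_gaussian(samples):
    """Fit a 1-D Gaussian to expert lateral offsets (toy stand-in for a learned distribution)."""
    return mean(samples), pstdev(samples)

def log_density(x, mu, sigma):
    """Log of the Gaussian probability density at x."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def satisfies_constraint(traj, mu, sigma, threshold):
    """Allow a trajectory only if every point is likely enough under the expert model."""
    return all(log_density(y, mu, sigma) >= threshold for y in traj)

def smoothness_cost(traj):
    """Hand-designed cost: sum of squared second differences."""
    return sum((traj[i + 1] - 2 * traj[i] + traj[i - 1]) ** 2
               for i in range(1, len(traj) - 1))

def plan(candidates, mu, sigma, threshold):
    """Return the lowest-cost candidate that satisfies the learned constraint, or None."""
    feasible = [t for t in candidates if satisfies_constraint(t, mu, sigma, threshold)]
    return min(feasible, key=smoothness_cost) if feasible else None

# Hypothetical expert lateral offsets from the lane centre, in metres.
expert_offsets = [-0.3, -0.1, 0.0, 0.1, 0.2, 0.0, -0.2, 0.3]
mu, sigma = fit_gaussian(expert_offsets)
threshold = log_density(mu + 3 * sigma, mu, sigma)  # assumed cut-off at 3 sigma

candidates = [
    [0.0, 0.1, 0.2, 0.1],  # small lateral motion, stays near the lane centre
    [0.0, 1.0, 2.0, 3.0],  # perfectly smooth, but drifts far from expert behaviour
    [0.0, 0.3, 0.0, 0.3],  # stays near the centre, but oscillates
]

unconstrained = min(candidates, key=smoothness_cost)
constrained = plan(candidates, mu, sigma, threshold)
print("unconstrained choice:", unconstrained)
print("constrained choice:  ", constrained)
```

Without the constraint, the cost alone prefers the trajectory that drifts away from the lane; with the constraint, that candidate is rejected before the cost is evaluated. The cost function itself is left unchanged, which is the sense in which a learned constraint can act as an add-on module.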


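Related content

Cost function

In mathematical optimization, statistics, econometrics, decision theory, machine learning, and computational neuroscience, a cost function (also called a loss function) maps an event, or the values of one or more variables, onto a real number that intuitively represents some cost associated with that event. An optimization problem seeks to minimize a loss function. An objective function is either a loss function or its negative, in which case it is to be maximized.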
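As a small illustration of the last two sentences, the sketch below minimizes a mean squared-error loss over a grid of candidate values and checks that maximizing the negated loss selects the same point. The data, the grid, and the function name are assumptions chosen only for this example.

```python
def squared_loss(theta, data):
    """Mean squared error between a scalar estimate theta and the data."""
    return sum((x - theta) ** 2 for x in data) / len(data)

data = [1.0, 2.0, 3.0, 6.0]                 # toy observations
grid = [i / 100 for i in range(0, 701)]     # candidate estimates from 0.00 to 7.00

# Minimizing the loss and maximizing its negative pick the same estimate.
theta_min = min(grid, key=lambda t: squared_loss(t, data))
theta_max = max(grid, key=lambda t: -squared_loss(t, data))
print(theta_min, theta_max)
```

Both searches return the sample mean of the data, which is the known minimizer of the mean squared error.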


