The technological and scientific challenges involved in the development of autonomous vehicles (AVs) are currently of primary interest to many automobile companies and research labs. However, human-controlled vehicles are likely to remain on the roads for several decades to come and will share future traffic environments with AVs. In such mixed environments, AVs should deploy human-like driving policies and negotiation skills to enable smooth traffic flow. To generate automated human-like driving policies, we introduce a model-free, deep reinforcement learning approach that imitates an experienced human driver's behavior. We study a static obstacle avoidance task on a two-lane highway road in simulation (Unity). Our control algorithm receives a stochastic feedback signal from two sources: a model-driven part, encoding simple driving rules such as lane-keeping and speed control, and a stochastic, data-driven part, incorporating human expert knowledge from driving data. To assess the similarity between machine and human driving, we model the distributions of track position and speed as Gaussian processes. We demonstrate that our approach leads to human-like driving policies.
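The two-part feedback signal described above can be illustrated with a minimal sketch. The function names, weighting scheme, and the Gaussian form of the data-driven term are assumptions for illustration, not the authors' actual implementation: the rule-based part penalizes lane deviation and speed error, while the data-driven part scores a state by its log-likelihood under a Gaussian fit to expert driving data.

```python
import numpy as np

# Hypothetical sketch of a two-part reward signal; names and weights
# are illustrative assumptions, not the paper's implementation.

def rule_based_reward(lane_offset, speed, target_speed=25.0):
    """Model-driven part: penalize lane deviation and speed error."""
    return -abs(lane_offset) - 0.1 * abs(speed - target_speed)

def expert_reward(lane_offset, speed, mu, sigma):
    """Data-driven part: unnormalized Gaussian log-likelihood of the
    (lane_offset, speed) pair under statistics (mu, sigma) estimated
    from expert driving data."""
    x = np.array([lane_offset, speed])
    return float(-0.5 * np.sum(((x - mu) / sigma) ** 2))

def combined_reward(lane_offset, speed, mu, sigma, w=0.5):
    """Convex combination of the rule-based and data-driven parts."""
    return ((1.0 - w) * rule_based_reward(lane_offset, speed)
            + w * expert_reward(lane_offset, speed, mu, sigma))
```

A state that matches both the driving rules and the expert statistics scores highest; the weight `w` trades off rule compliance against similarity to human data.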