Reinforcement learning (RL) requires skillful problem definition and substantial computational effort to solve optimization and control problems, which may limit its practical prospects. Introducing human guidance into RL is a promising way to improve learning performance. In this paper, we establish a comprehensive human guidance-based RL framework. We propose a novel prioritized experience replay mechanism that adapts to human guidance during the learning process, boosting the efficiency and performance of the RL algorithm. To relieve the heavy workload on human participants, we build a behavior model, based on an incremental online learning method, that mimics human actions. We design two challenging autonomous driving tasks to evaluate the proposed algorithm. Experiments are conducted to assess its training and testing performance and its learning mechanism. Comparative results against state-of-the-art methods suggest that our algorithm offers advantages in learning efficiency, performance, and robustness.