Improving Reinforcement Learning with Human Assistance: An Argument for Human Subject Studies with HIPPO Gym

Reinforcement learning (RL) is a popular machine learning paradigm for game playing, robotics control, and other sequential decision tasks. However, RL agents often have long learning times and high data requirements because they begin by acting randomly. This article argues that, in complex tasks, an external teacher can often significantly help the RL agent learn. OpenAI Gym is a common framework for RL research that includes a large number of standard environments and agents, making RL research significantly more accessible. This article introduces our new open-source RL framework, the Human Input Parsing Platform for OpenAI Gym (HIPPO Gym), and the design decisions that went into its creation. The goal of this platform is to facilitate human-RL research, again lowering the bar so that more researchers can quickly investigate different ways that human teachers could assist RL agents, including learning from demonstrations, learning from feedback, and curriculum learning.
