Deep reinforcement learning (RL) has achieved breakthrough results on many tasks, but agents often fail to generalize beyond the environments they were trained in. As a result, deep RL algorithms that promote generalization are receiving increasing attention. However, existing studies in this area use a wide variety of tasks and experimental setups for evaluation, and the literature lacks a controlled assessment of the merits of different generalization schemes. Our aim is to catalyze community-wide progress on generalization in deep RL. To this end, we present a benchmark and experimental protocol, and conduct a systematic empirical study. Our framework contains a diverse set of environments, our methodology covers both in-distribution and out-of-distribution generalization, and our evaluation includes deep RL algorithms that specifically tackle generalization. Our key finding is that `vanilla' deep RL algorithms generalize better than specialized schemes that were proposed specifically to tackle generalization.