To solve tasks in new environments involving objects unseen during training, agents must reason over prior information about those objects and their relations. We introduce the Prior Knowledge Graph network, an architecture that combines prior information, structured as a knowledge graph, with a symbolic parsing of the visual scene. We demonstrate that this approach can apply learned relations to novel objects, whereas the baseline algorithms fail. Ablation experiments show that the agents ground the knowledge graph relations to semantically relevant behaviors. In both a Sokoban game and the more complex Pacman environment, our network is also more sample-efficient than the baselines, reaching the same performance in 5-10x fewer episodes. Once agents are trained with our approach, we can manipulate their behavior by modifying the knowledge graph in semantically meaningful ways. These results suggest that our network provides a framework for agents to reason over structured knowledge graphs while still leveraging gradient-based learning approaches.